ABSTRACT

The general problem addressed in this thesis concerns 
formative evaluation relevant to curriculum development. The research
strategy was that of an aptitude-treatment interaction (ATI) study.
Aptitude was defined in terms of the individual's ability to learn 
specific concepts associated with a unit of length measurement. The 
treatments were designed to differ only in their emphasis on a unit 
of area measurement. The specific question asked was: In what manner
does the ability of children to learn concepts associated with a unit 
of length affect the extent to which they attain concepts associated 
with area and a unit of area for each of the two given treatments? In 
order to determine this ability, 90 second and third graders were
subjected to a teach-test procedure. This procedure consisted of a
pretest, a brief instructional treatment, and a posttest, all of which
tested or taught about a unit of length. The results of the two tests 
were used to determine the aptitude levels. No significant 
interactions were found between the aptitudes and treatments on any 
of the measures. There were significant main effects due to aptitude 
and to treatment for achievement and retention measures. Other 
findings relevant to curriculum development reported in this study 
are: (1) It is feasible to teach these area concepts to second and 
third graders, and (2) Second and third graders are capable of 
handling conflicting situations involving units of area. 
(Author/CK) 
STATEMENT OF FOCUS



Individually Guided Education (IGE) is a new comprehensive system 
of elementary education. The following components of the IGE system 
are in varying stages of development and implementation: a new
organization for instruction and related administrative arrangements;
a model of instructional programing for the individual student; and
curriculum components in prereading, reading, mathematics, motivation, 
and environmental education. The development of other curriculum 
components, of a system for managing instruction by computer, and of 
instructional strategies is needed to complete the system. Continuing 
programmatic research is required to provide a sound knowledge base for
the components under development and for improved second generation 
components. Finally, systematic implementation is essential so that 
the products will function properly in the IGE schools. 

The Center plans and carries out the research, development, and 
implementation components of its IGE program in this sequence:
(1) identify the needs and delimit the component problem area;
(2) assess the possible constraints (financial resources and availability
of staff); (3) formulate general plans and specific procedures for
solving the problems; (4) secure and allocate human and material
resources to carry out the plans; (5) provide for effective communication
among personnel and efficient management of activities and resources;
and (6) evaluate the effectiveness of each activity and its contribution
to the total program and correct any difficulties through
feedback mechanisms and appropriate management techniques.

A self-renewing system of elementary education is projected in
each participating elementary school, i.e., one which is less dependent
on external sources for direction and is more responsive to the needs
of the children attending each particular school. In the IGE schools,
Center-developed and other curriculum products compatible with the
Center's instructional programing model will lead to higher morale
and job satisfaction among educational personnel. Each developmental
product makes its unique contribution to IGE as it is implemented in
the schools. The various research components add to the knowledge of
Center practitioners, developers, and theorists.
ABSTRACT



The general problem addressed in this thesis concerns formative 
evaluation relevant to curriculum development. More specifically, the
process of measuring as it relates to primary students was the portion
of the mathematics curriculum considered. This was narrowed to examining
the interaction of two treatments on measuring area with various levels
of aptitude. The research strategy was that of an aptitude-treatment
interaction (ATI) study. 

Most of the past educational ATI studies prescribe treatments which 
differ on form or mode and capitalize on the learners' best general 
abilities. This study approached the ATI question in a different 
manner. General capabilities were not used as measures of aptitude and
the treatments did not differ in form or mode. Aptitude was defined in
terms of the individual's ability to learn specific concepts associated
with a unit of length measurement. The treatments were designed to differ
only in their emphasis on a unit of area measurement. The specific
question asked was: In what manner does the ability of children to
learn concepts associated with a unit of length affect the extent to
which they attain concepts associated with area and a unit of area for
each of the two given treatments?

In order to determine this ability, 90 second and third graders
were subjected to a teach-test procedure. This procedure consisted of
a pretest, a brief instructional treatment, and a posttest, all of which
tested or taught about a unit of length. The results of the two tests
were used to determine the aptitude levels. Although three levels of
aptitude were expected, only two subjects met the criteria for one of
the levels. They were dropped from the remainder of the study along
with those students who did not fit the definition of either of the other 
two levels. The students in each of the other two levels (32 in Level 
I and 27 in Level II) were randomly assigned to the two treatments. 

Both treatments had the same behavioral objectives, the same 
teacher, the same duration (9 days) and the same mode of instruction. 
They differed in their treatment of the unit of measure for area. 
After the treatments three measures, achievement, transfer, and retention,
were taken. These measures were used to test hypotheses about
the interaction of aptitude with treatments and about the main effects 
of aptitude and of treatment. 

No significant interactions were found between the aptitudes and 
treatments on any of the measures. There were significant main effects 
due to aptitude and to treatment for achievement and retention measures. 
In interpreting these results one must take into account that one level 
of aptitude was not found in the given population. It had been hypoth- 
esized that much of the interaction would have been due to this level. 

Other findings relevant to curriculum development reported in this 
study are: 1) It is feasible to teach these area concepts to second 
and third graders, 2) Second and third graders are capable of handling 
conflicting situations involving units of area. 
Chapter I
INTRODUCTION TO THE THESIS 



Introduction 

The main purpose of this study was to examine the interaction 
of two treatments on measuring area with various levels of aptitude. 
This first sentence contains many words (i.e., area, measuring, and
aptitude) which may be interpreted in a variety of ways. To minimize
misunderstandings or misinterpretations throughout this thesis,
the first chapter begins by defining or explaining crucial terms.

The remainder of this chapter gives an overview of the study 
and of the thesis. Both the general problem considered and the 
specific problem investigated are identified and a brief descrip- 
tion of the experiment is included. To complete this introductory 
chapter the remaining chapters are outlined. 

Crucial Terms

Aptitude. "Any characteristic of the individual that increases
(or impairs) his probability of success in a given (educational)
treatment" (Cronbach and Snow, 1969, p. 7). The characteristic examined
in this study was the child's ability to learn certain mathematical
concepts about a unit of length. This ability was determined
through a procedure known as a teach-test procedure.

Aptitude Levels. Three levels of aptitude were used to classify
subjects as follows:

Level I: Subjects who had not attained the specified behaviors
by the end of the teach-test procedure and evidenced
no change in performance from pretest to posttest.
Level II: Subjects who attained the specified behaviors only
after the teach-test procedure and evidenced change
in performance from pretest to posttest.
Level III: Subjects who had already attained the specified behaviors
before and maintained them throughout the
teach-test procedure.
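
The three-way classification above can be read as a small decision rule. The sketch below is illustrative only: the actual criteria are given in Chapter V, so the cutoffs `MASTERY_ITEMS` and `MIN_GAIN_ITEMS`, the ten-item test length they presuppose, and the function name `classify_aptitude` are all hypothetical.

```python
from typing import Optional

# Hypothetical cutoffs on an assumed 10-item test; the study's actual
# criteria appear in Chapter V and are not reproduced here.
MASTERY_ITEMS = 8    # assumed score that counts as "attained"
MIN_GAIN_ITEMS = 2   # assumed pretest-to-posttest gain that counts as "change"


def classify_aptitude(pretest: int, posttest: int) -> Optional[str]:
    """Return 'I', 'II', 'III', or None when no level definition fits."""
    attained_before = pretest >= MASTERY_ITEMS
    attained_after = posttest >= MASTERY_ITEMS
    changed = (posttest - pretest) >= MIN_GAIN_ITEMS

    if attained_before and attained_after:
        return "III"  # already attained and maintained
    if not attained_before and attained_after and changed:
        return "II"   # attained only after instruction, with change
    if not attained_before and not attained_after and not changed:
        return "I"    # not attained, no change in performance
    return None       # fits none of the three definitions
```

A subject who improves without reaching the cutoff falls outside all three definitions, which corresponds to the students who were dropped for not fitting any level.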

Aptitude-Treatment Interaction (ATI). The interaction between
learning abilities (aptitudes) and instructional treatments. Studies
which investigate this interaction are known in the literature as ATI
studies.

Attributes. Characteristics or properties of objects or sets.
This study is concerned only with the attributes of length and area 
and only these as they are described here. 

Length: An undefined attribute; perceptually the dominant attribute
of long, thin objects and mathematically represented
by a line segment.
Area: An undefined attribute; perceptually a dominant attribute
of flat objects and mathematically represented by a planar
region.

Curriculum. A broad view of curriculum is taken in this thesis.
Mathematics curriculum is composed of four components: mathematics
program, learner, teacher, and instruction (Romberg and DeVault, 1967)
and is defined by specifying the components and the interrelations
among them.

Developing Mathematical Processes (DMP). A K-6 elementary mathematics
program being developed by the Analysis of Mathematics Instruction
(AMI) Project at the Wisconsin Research and Development Center
for Cognitive Learning. The program is based on a measurement ap- 
proach to mathematics and mathematical processes are emphasized. The 
instructional treatments for this study were developed to reflect 
this approach to mathematics (Romberg, Fletcher and Scott, 1968). 

Measure-Measurement. In this thesis a distinction is made between
measure and measurement. A measurement is a symbolic representation
of an attribute which includes both the unit and the number
assigned by the process of measuring. A measure is a symbolic
representation of an attribute consisting only of the number.
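
The distinction can be made concrete with a small data structure. This is a sketch only; the class name `Measurement`, its fields, and the example units are hypothetical and are not taken from DMP materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A number together with the unit it was counted in."""
    number: int
    unit: str  # hypothetical unit label, e.g. "paper clip"

    @property
    def measure(self) -> int:
        """The measure is the number alone, with the unit dropped."""
        return self.number


# Two hypothetical objects measured with different units.
desk = Measurement(number=6, unit="paper clip")
book = Measurement(number=6, unit="pencil")
```

Here `desk.measure` and `book.measure` are the same number, while `desk` and `book` are different measurements because their units differ.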

Processes. In this study only the following processes as de- 
fined in DMP were emphasized. 

Comparing: The process of deciding whether two objects are
alike with respect to a stated attribute.

Ordering: The process of deciding the direction of the
difference between two objects with respect
to a stated attribute.

Representing: The process of denoting or expressing in a
different medium an attribute of an object
or a set. If the medium is physical, a
physical representation is made. Similarly,
if the medium is a picture or a symbol, then
a pictorial or symbolic representation is made.

Measuring: The process of assigning a numeral as a symbolic
representation of an attribute of an object.
These steps are followed:

1) A set of objects is recognized as
possessing the attribute,

2) the procedures for comparing, ordering,
and combining the objects are established,
and

3) a unit is specified, thus assigning to each
object of the domain a non-negative number.

Teach-test Procedure (T-T). A particular procedure which simulates
the actual classroom environment in order to attain a measure
of the student's ability to learn. The procedure consists of a pretest,
a brief instructional unit, and a posttest. In this study
this procedure was used to determine the aptitude levels.

Treatments. There were two treatments designed for the study.
No attempt is made here to describe fully the treatments; the intent
of this statement is only to indicate that Treatment U emphasizes
the unit and Treatment N does not. For a complete description of the
treatments see Chapter V and Appendix C.

From the General to the Specific Problem

"To generate knowledge about mathematics instruction and to incorporate
it into a validated instructional program" (Romberg and
Harvey, 1969, p. 1) is the stated purpose of the project, Analysis
of Mathematics Instruction, which is developing the program Developing
Mathematical Processes (DMP). This curriculum program approaches
elementary number concepts through measurement. This is not a common 
approach and little research exists which is relevant to some of the 
problems posed by this approach. This study is part of a series of 
studies (Scott, 1969; Gilbert, 1969; Weinstein, 1970; Carpenter, 
1971; and Carpenter, in press) which are designed to generate know- 
ledge about measurement instruction so that it may be incorporated 
into DMP. 

At the time this study was being planned and executed, instructional
topics on area were being developed and questions about the
child's understanding of the role of a unit in the process of measurement
were being raised. The research on area and on the role of the
unit is even more scarce than research on many other measurement questions.
Thus, the investigator decided to examine the role of the
unit as it applied to measuring area.

Furthermore, DMP is the mathematics component of Individually
Guided Education (IGE), the focus of the Wisconsin Research and
Development Center for Cognitive Learning (Klausmeier, Quilling,
and Sorenson, 1971). In developing an individually guided mathematics
program based on research it seems natural to respond to
Cronbach and Snow's (1969) pleas for research which formulates more
precisely the ways in which programs can be varied so as to fit
learners' characteristics. Therefore, the method of research
adopted to investigate the role of a unit in measuring area was an
aptitude-treatment interaction study.

Most of the past educational ATI studies fit what Salamon 
(1971) has described as the preferential model. This model prescribes
treatments differing on form or mode and capitalizes on the
learners' best general capabilities. However, few of these studies
have been successful in producing the desired interaction (Bracht, 
1969). 

This study approached the ATI question in a different manner. 
General capabilities were not used as measures of aptitude and the 
treatments did not differ on form or mode. Aptitude was defined in 
terms of the individual's ability to learn specific concepts asso- 
ciated with a unit of length measurement. The treatments were de- 
signed to differ on their emphasis on the unit of area measurement. 
Thus, the specific question asked was: In what manner does the abil- 
ity of children to learn concepts associated with a unit of length 
affect the extent to which they attain concepts associated with area 
and a unit of area for each of the two given treatments? 






Brief Description of the Experiment 

The basic research design of an aptitude-treatment interaction 
experiment is one in which the identified aptitude is measured, the 
subjects are randomly assigned to a treatment, and the outcome is 
measured (Cronbach and Snow, 1969, p. 21). 

Therefore, the first step after identifying the aptitude is to 
measure it. Since aptitude was defined as the child's ability to 
attain certain behaviors concerning a unit of length, the following 
procedure (the teach-test procedure) was used to measure aptitude. 
On the stated objectives concerning length 110 second and third 
graders in one school were given a pretest, were instructed for two 
days, and were given a posttest. From the results of the tests, subjects
were classified into two levels: Level I, those students who
had not attained the specified behaviors prior to or after instruc- 
tion and who evidenced no change in performance; Level II, those 
students who had not attained the specified behaviors prior to in- 
struction, but had attained them after the instruction and who had 
evidenced change in performance. Because of previous experimentation 
a third level of aptitude, those students who had attained the spec- 
ified behaviors prior to instruction, was expected. Only two chil- 
dren in the experiment's population fitted this description and, 
therefore, were dropped from the study. Those students who were 
absent during the testing or instruction or who did not fit in either 
Level I or Level II were also dropped. (A complete description of 
the criteria used to determine the levels is found in Chapter V.)

The remaining subjects in Level I and in Level II were assigned
randomly to two treatments. The two treatments were designed to 
interact with the aptitude. Since the aptitude reflected the abil- 
ity to learn concepts about a unit of length, one treatment (Treat- 
ment U) was developed to emphasize similar concepts about a unit of 
area and the other (Treatment N) was developed to de-emphasize such 
concepts. Both treatments were activity oriented, were taught by 
the same teacher, were monitored by the experimenter and consisted 
of nine forty-minute sessions. 
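
Random assignment within each aptitude level can be sketched as follows. The level sizes (32 and 27) come from the abstract; the subject labels, the fixed seeds, and the choice to keep the two groups as nearly equal in size as possible are assumptions made for illustration, since the text does not describe the mechanics of the assignment.

```python
import random


def assign_within_level(subject_ids, treatments=("U", "N"), seed=0):
    """Shuffle one aptitude level and deal subjects out to treatments in turn.

    The seed is fixed only so the sketch is reproducible; it is not a
    detail taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for position, subject in enumerate(shuffled):
        groups[treatments[position % len(treatments)]].append(subject)
    return groups


# Level sizes come from the abstract; the subject labels are hypothetical.
level_1 = [f"L1-{i:02d}" for i in range(32)]
level_2 = [f"L2-{i:02d}" for i in range(27)]
groups_1 = assign_within_level(level_1)
groups_2 = assign_within_level(level_2, seed=1)
```

Assigning separately within each level keeps both treatments represented at both aptitude levels, which is what allows an interaction to be examined at all.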

Upon completion of the instructional treatments three measures, 
achievement, transfer, and retention, were taken. The results, analysis,
and implications are discussed in full in the last three chapters.
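
In a design with two aptitude levels and two treatments, an aptitude-treatment interaction appears as a difference between the treatment effects at the two levels. The sketch below computes that difference of differences from cell means. The cell means are invented for illustration and are not data from this study, and the function name `interaction_contrast` is hypothetical; a test of significance would still be needed to judge any observed contrast.

```python
def interaction_contrast(cell_means: dict) -> float:
    """Difference between treatment effects at the two aptitude levels.

    cell_means maps (aptitude_level, treatment) to a mean outcome score.
    A value near zero means the treatment effect is about the same at
    both levels, that is, no aptitude-treatment interaction.
    """
    effect_level_1 = cell_means[("I", "U")] - cell_means[("I", "N")]
    effect_level_2 = cell_means[("II", "U")] - cell_means[("II", "N")]
    return effect_level_2 - effect_level_1


# Invented cell means, for illustration only (not data from this study).
parallel = {("I", "U"): 12.0, ("I", "N"): 10.0,
            ("II", "U"): 17.0, ("II", "N"): 15.0}
crossing = {("I", "U"): 10.0, ("I", "N"): 12.0,
            ("II", "U"): 17.0, ("II", "N"): 15.0}
```

The `parallel` pattern shows main effects of both aptitude and treatment with no interaction; the `crossing` pattern shows the treatments reversing in rank between levels.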

Outline of the Remainder of the Thesis 

Chapter II demonstrates the significance of the general problem 
and draws a relation of the specific problem to the general problem. 
The discussion of the specific problem includes an analysis of it, 
the rationale, the proposed questions, and its significance. Chap- 
ter III is a summary of research directly relevant or indirectly re- 
lated to the study. 

Chapter IV contains the considerations made in designing and 
the plans made for conducting the study. The pilot studies and their 
effects on the plans for the final study are reported. The design,
the population, the teach-test procedure, the treatments, the instruments,
the hypotheses, and the analysis are presented as they were
proposed for the study.

In Chapter V the actual execution of the experiment is described. 
The population is characterized, the summary of the teach-test procedure
is given, the results of the teach-test observations are reported,
the aptitude levels are defined, the daily description of the
treatments is included, and the results of the treatment observations 
are reported. 

Chapter VI is a report of the statistical analysis. Chapter VII, 
the concluding chapter, includes interpretations of the analysis, a 
summary of the study and its implications, and the projections for 
further study. 



Chapter II
THE PROBLEM 



Introduction 

The specific question asked in this study was: How does chil- 
dren's ability (aptitude) to learn concepts about a unit of length 
interact with alternative treatments on measuring area? In this 
chapter the derivation of this question from the general problem is 
reported. First the general problem and its significance are identified.
After the necessary background for understanding the specific
problem is given, the specific problem is identified. Next a description
of the research strategy used to investigate the specific
problem and its influence on the final question is given. The treatments
are briefly described and the method for determining the aptitude
is explicated so that the specific question may be interpreted
accordingly. 



General Problem 

"Evaluation, used to improve the course while it is still fluid, 
contributes more to improvement of education than evaluation used to
appraise a product once it is on the market" (Cronbach, 1969, p. 364).
Many similar quotes by those who promote what Scriven (1967) has called
formative evaluation may be found, but few reports of such studies are
available in published literature. This is partly due to one function
of formative studies: the changes are made in the product according
to the findings and these constitute the report of the study.

However, with the increase of federal and private corporations' 
monies for research and development, more encouragement is being given
formative studies of curriculum projects (Kirst and Walker, 1971)
and the reports thereof. The products being developed at Research
and Development Centers or at Regional Laboratories fall into this
category. One such product is Developing Mathematical Processes (DMP),
a mathematics program for grades K-6, being developed by the mathe- 
matics project (Analysis of Mathematics Instruction Project) at the 
Wisconsin Research and Development Center for Cognitive Learning. 

The underlying purpose of the Analysis of Mathematics Instruc- 
tion Project, "to generate knowledge about mathematics instruction 
and to incorporate it into a validated mathematics program," (Romberg 
and Harvey, 1969) establishes the need for such research in connec- 
tion with this project. The investigator, working for the project, 
assumed this need and the significance thereof, and proceeded to look 
at a portion of this general problem, the portion concerning instruction
in the process of measuring.

Background 

To understand the specific problem it is necessary to have some
background of DMP's approach to mathematics, the process of measuring,
and the role of the unit in measuring.

DMP's approach to mathematics. DMP approaches elementary mathematics
through measurement (Romberg, Fletcher, and Scott, 1968). This
is not a common approach; most contemporary elementary mathematics
programs use a set theoretical approach. That is, most programs begin
with counting sets of objects to attain mastery of number concepts,
but DMP begins with the processes underlying measuring and eventually
a measure (number) is assigned to a given attribute of an object or
of sets.

These processes underlying measuring (comparing, ordering, and
representing) are three of the processes emphasized in DMP. The program
begins in kindergarten, after a topic in which objects are described
and classified on attributes familiar to children, by focusing
on the attribute of length, an attribute perceptually salient to children
of this age. First, the children compare and order lengths directly
by placing the objects side by side. When these processes are
mastered in this context, the children are presented the problem of
comparing two objects which cannot be placed side by side. The process
of physically representing a length is introduced to help them
solve this problem. In refining this process, the procedure of laying
units end to end in order to represent a length is learned. The
children proceed to pictorially representing lengths and, finally,
to symbolically representing lengths. This last step is taken late
in kindergarten or early in the first grade after the same processes
have been considered with the attribute of numerousness. Throughout
the program these same processes, along with others, are reconsidered
with many other attributes: weight, capacity, duration, area, etc.

Symbolic stimuli which may be interpreted as representing attri- 
butes or processes are not given until after the children have been 
exposed to related physical or pictorial stimuli. Thus, at the early
levels DMP depends heavily on the child's ability to model a symbolic 
statement physically or pictorially and not upon manipulating symbols. 

The measurement approach complements DMP's other pedagogical
strategies: use of physical referents, problem-solving, functional
transition, and spiraling techniques (Romberg, Fletcher, and Scott,
1968). It provides a wide variety of physical referents by considering
attributes other than just numerousness. It allows the children
to solve problems with perceptually meaningful materials, problems
similar to those they will later be presented symbolically. Many of
the functional transitions to a new process or to a new skill can be
made through measurement problems which the child can solve at an
early age. The introduction of new attributes permits spiraling
through the processes at a pace suitable to the child's development. 

This description of DMP is not intended to be exhaustive; it 
only includes those points relevant to the specific problem. For a 
complete description the following papers may be referenced (Romberg 
and Harvey, 1969; Harvey, Romberg, and Fletcher, 1969). 

The process of measuring. The process of measuring may be considered
as one of mapping empirical properties or relations into a
formal model (Stevens, 1959, p. 20). This definition requires a set
A of elements which possesses the properties and relations under con- 
sideration, a set B which possesses a formal structure, and a pro- 
cedure for associating each element of A with an element of B in such 
a way that the essential properties and relations of set A are pre- 
served (Blakers, 1967). While a more precise definition is necessary 
for the study of all measurable functions, this definition is suffi- 
cient for most common measure functions. Since the measure functions 
in this study are related to length and area, the remainder of this 
discussion is relevant to such common functions.

The first step in the measuring process is identification of a 
domain of elements which possesses the attribute under question. 
(For example, the domain may be the set of all objects which possess 
the attribute of length.) Through empirical procedures a recogniz- 
able structure is imposed upon this domain. An equivalence relation 
is established through a procedure for comparing the elements. (For 
the attribute of length this amounts to deciding whether or not the 
objects are the same length.) This equivalence relation partitions 
the domain into equivalence classes. Further, a strict total order 
relation is established through the procedure of ordering elements 
from the equivalence classes. (For the attribute of length two equivalence
classes may be ordered by ordering representative objects
from each class, that is, by deciding which object is longer.) Lastly,
a binary operation is defined which is associative and commutative.
(In the case of length, this operation may be described as "combining"
or joining two objects end to end.) Through these relations and this 
operation the recognizable structure of an ordered abelian semi-group 
is imposed upon the domain. 
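
The structure imposed on the domain can be sketched for the attribute of length. In the sketch below the class `Rod`, its fields, and the three function names are hypothetical; the stored `length` stands in for the physical extent that children would compare by placing objects side by side, and it is not itself a measure.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Rod:
    """A hypothetical object possessing the attribute of length."""
    name: str
    length: Fraction  # stands in for the physical extent, not a measure


def same_length(a: Rod, b: Rod) -> bool:
    """Comparing: the equivalence relation on the domain."""
    return a.length == b.length


def is_shorter(a: Rod, b: Rod) -> bool:
    """Ordering: the strict order between equivalence classes."""
    return a.length < b.length


def combine(a: Rod, b: Rod) -> Rod:
    """Combining: join two rods end to end."""
    return Rod(f"({a.name}+{b.name})", a.length + b.length)


red = Rod("red", Fraction(3))
blue = Rod("blue", Fraction(5))
green = Rod("green", Fraction(3))
```

Joining `red` to `blue` and joining `blue` to `red` give rods of the same length, as does regrouping three rods, which illustrates the commutative and associative properties named above.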

The next step in the process of measuring is specifying the formal
model, that is, indicating a set which possesses a formal structure.
For common measure functions this set is the non-negative real
numbers. The only step remaining is the definition of a function
which preserves the essential characteristics of the domain and
assigns each element of the domain to a non-negative real number.
This assignment is completely defined by specifying a unit, that is,
by identifying an equivalence class whose image is 1. Thus, an isomorphism
is set up between the domain and the set of non-negative
real numbers which preserves the operation and order relation imposed 
on the domain. 

Therefore, in the process of measuring length, area, or any
other common measurable attribute the same steps are followed:

1) A set of elements (the domain) is recognized as possessing
the attribute,

2) the procedures for comparing, ordering, and combining the
elements of the domain are established, and

3) a function is defined which preserves the essential
characteristics of the domain and assigns to each element
of the domain a non-negative real number.

This study is concerned with all three steps, although the function
is always defined simplistically as the counting function. That
is, each element of the domain is mapped onto a non-negative integer
by counting the number of units combined so as to be equivalent (to
the nearest unit) to that element. 
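
The counting function just described can be sketched as repeatedly laying units end to end until adding another would move farther from the object's length. The function name `count_units` is hypothetical, and the handling of an exact half-unit remainder (counted upward here) is an assumption, since the text does not say how such cases were treated.

```python
from fractions import Fraction


def count_units(length: Fraction, unit: Fraction) -> int:
    """Count how many copies of `unit`, laid end to end, come closest to `length`.

    Ties (exactly half a unit left over) are counted upward here; the
    thesis does not say how such cases were handled.
    """
    if unit <= 0:
        raise ValueError("the unit must have positive length")
    count = 0
    covered = Fraction(0)
    # Keep laying units while the next one brings us at least as close.
    while abs(length - (covered + unit)) <= abs(length - covered):
        covered += unit
        count += 1
    return count
```

For instance, `count_units(9, 4)` stops at two units, because a third unit would overshoot by more than the second falls short.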

The role of the unit. As stated in the previous section, the
assignment of each element to a non-negative real number is defined
by specifying an equivalence class to be the unit. Technically, each
time a different equivalence class is specified to be the unit, a different
function is defined. For example, if the unit of length is
specified to be two inches, then this measure function maps all lengths
of two inches onto 1, all lengths of twelve inches onto 6, etc. But, 
if the unit of length is chosen to be four inches, then this function 
maps all lengths of four inches onto 1, all lengths of twenty-four 
inches onto 6, etc. 

If in comparing two measures one fails to realize that different 
functions may have been used in assigning the measures, that is, different
units have been specified, then one may reach an erroneous conclusion.
For example, if the fact that two different units have been
used in the previous examples is ignored, the conclusion may be reached
that a length of twelve inches is equal to a length of twenty-four
inches since they both correspond to 6 when mapped (by the first and 
second functions, respectively, as defined in the previous examples) 
onto the non-negative reals. 
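
The two measure functions in this example, and the error that results from comparing their values directly, can be written out as follows. The function names `measure` and `equal_lengths` are hypothetical, and the sketch handles only lengths that are whole multiples of the unit.

```python
def measure(length_in_inches: int, unit_in_inches: int) -> int:
    """Measure function determined by the chosen unit (exact multiples only)."""
    if length_in_inches % unit_in_inches != 0:
        raise ValueError("this sketch handles whole numbers of units only")
    return length_in_inches // unit_in_inches


def equal_lengths(measure_a: int, unit_a: int, measure_b: int, unit_b: int) -> bool:
    """Compare two measurements by coordinating number and unit."""
    return measure_a * unit_a == measure_b * unit_b


first = measure(12, unit_in_inches=2)    # unit of two inches
second = measure(24, unit_in_inches=4)   # unit of four inches
```

Both `first` and `second` equal 6, so comparing the measures alone suggests the lengths are equal, whereas `equal_lengths(first, 2, second, 4)` takes the units into account and reports that they are not.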
The Russians, Gal'perin and Georgiev (1969), addressed themselves
to this problem. They proposed a program which began the study of 
numbers by focusing on the unit. Their experimentation found that 
the experimental group who were taught about units achieved superior 
results on the 14 measurement and conservation tasks administered. 

Not many elementary series in this country approach this prob- 
lem. Most of their problems are structured so that only one func- 
tion is being used at a time. This procedure is not consistent with 
DMP's approach of letting the children generate their own numbers 
and problems, since then there is no guarantee that only one function 
will arise within a problem. 

Specific Problem 

In the process of developing a curriculum there are many de- 
cisions to be made. Some of these decisions are made subjectively, 
some are derived logically from basic assumptions, and some are made 
using empirical evidence. While the investigator was working on the 
development of DMP, the need to examine the attribute of area arose. 
Although the decision had been made to include area after the
attributes of length, numerousness, and weight and to treat area in
the same manner as these attributes had been treated, no firm
decisions had been made about the placement or the particulars of
the activities in the total instructional program.

In late spring of 1971 the investigator worked with first and 
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second graders on activities concerning area. Both a manipulative 
approach and a more traditional rule-example approach were tried. 
At the end of the activities (a more complete description is given in 
Chapter IV), both groups could exhibit the specified behaviors. How- 
ever, in individually testing the subjects, the investigator noted 
that only some children from both groups had difficulty on the transfer
items which asked for a comparison of two regions not covered with
a common unit. Some children focused only on the number of units,
but others coordinated the size of the unit and the number. Likewise,
when measuring a region covered with non-congruent units, some children
responded only with the number of pieces but others coordinated
the relationships between the pieces to respond correctly. If ques- 
tioned more thoroughly about the unit, some children who initially 
responded incorrectly then focused on the unit and responded cor- 
rectly. 

These responses were not too surprising; other studies (Carpenter, 
1971; Gal'perin and Georgiev, 1969; and Piaget, 1964) have reported 
similar responses. However, they did clearly point out that some 
children are ready to assign a number to area and others need more 
experience with the unit. 

Working within the Individually Guided Education (IGE) framework 
of the Wisconsin Research and Development Center, the next natural 
questions were: (1) How to determine those individuals who needed 
and those who did not need more experience with the unit and (2) what 
treatments would be necessary for each group.

Thus, the specific problem was raised: How does children's 
ability to learn specified concepts about a unit of measurement inter- 
act with alternative treatments, one of which emphasizes the unit 
and the other of which does not? 

In order to answer this question the research strategy of an apti- 
tude-treatment interaction study was invoked. The next section des- 
cribes this strategy and its influence on this study. 

Research Strategy 

This specific problem is an instance of the general aptitude- 
treatment interaction question: "In what manner do the character- 
istics of learners affect the extent to which they attain the out- 
comes for each of the treatments that might be considered?" (Cronbach 
and Snow, 1969, p. 6) 

Such a strategy calls for specifying aptitudes and designing 
treatments which interact with these aptitudes. That is, a treat- 
ment A is designed that is better for a learner with given charac- 
teristics and a treatment B is designed that is better for a learner 
with other characteristics. For a complete description of this
strategy, see the summary by Cronbach and Snow (1969).

Although many supporters of individual differences have promoted
this strategy, few educational studies have produced the desired
interaction. Bracht (1969) analyzed 90 ATI studies in terms of their 
treatments, aptitudes (personological variables), and dependent variables
in hopes of explaining the lack of disordinal interactions.
However, only 5 of the possible 108 results of these studies showed 
disordinal interaction. Hence, only mild support was given to any 
hypotheses he made about the different types of variables. His work 
serves as a caution to those designing aptitude-treatment studies— 
the treatments must be carefully designed to take full advantage of 
the specified aptitudes. 

Cahen (1969) urges those involved in ATI studies to make a careful
analysis of the learning tasks in order to develop treatments and
to create relevant aptitude measures. The analysis for this study 
and the resulting treatments are described in the next section. After- 
wards, the creation of an aptitude measure relevant to these treat- 
ments is presented. 

Treatments 

Before designing the treatments the terminal objectives were 
specified. Then an analysis was made of the process of measuring area
and coordinated with DMP's approach of spiraling through physical, 
pictorial, and symbolic representations. The result was a flow chart 
of behaviors as shown in Figure 2.1. These behaviors were later 
organized into instructional objectives (see Chapter IV). 
The terminal objectives were: 

1) Given a region and a covering of the region, the student 
assigns a measure to the area of that region (see
box 12 in Figure 2.1). 
2) Given the measures of the areas of two regions, the 
student compares and orders the regions (see box 13 
in Figure 2.1). 

The two objectives in boxes 10 and 11 are those concerning the 
unit. The objective in box 10, "identifies the need for knowing both
the unit and the number," requires that the child focus on both the
unit and the number. If he is always presented situations in which 
the number cue is correct then there is little need for him to focus 
on both. The objective in box 11 is closely related to the one in 
box 10. If two areas are being compared the comparison is simpler 
if each region has been covered with congruent units and these units 
are the same for both regions. Then, and only then, is it sufficient 
to compare only the measures. 

This study hypothesized that some second and third graders do 
not need to be taught these objectives, some are ready to learn about 
them and others are not ready to assimilate these behaviors. Thus, 
the treatments were designed to either expose (Treatment U) or not 
expose (Treatment N) the subjects to these behaviors. Otherwise the 
treatments were held as constant as possible: the same length of time,
the same teacher, the same instructional mode, and the same terminal
objectives, with the two treatment groups randomly selected from a
given population.

Aptitudes

Aptitude may be defined as "any characteristic of the individual
that increases (or impairs) his probability of success in a given
treatment" (Cronbach and Snow, p. 7). The characteristic of the
individual identified in this study was the ability to learn about a
unit of measurement. In order to determine this ability a teach-test
procedure was used. This procedure consists of a pretest, a brief
instructional unit, and a posttest. The underlying assumption of
this procedure is stated in the following:

This procedure is based on the supposition that if
a student is unfamiliar with the content of the
instructional unit, and if it can be reasonably
assumed that he already has the background necessary
for learning it, then his performance on
the unit test will provide valid measure of his
ability to learn mathematics; this measure could,
in turn, be interpreted to be a valid measure
of his mathematical aptitude. (Heimer and Lottes,
p. 1-2)

Since the instructional unit for the teach-test procedure is brief, 
it is necessary to limit the objectives of this instruction. To 
help accomplish this the number of attributes was limited in this 
study. It was decided to use only one attribute. The attribute of 
area could not be used because any instruction concerning the unit 
of area would interfere with the differences between the two treat- 
ments. The attribute of length was chosen for several reasons. 

Figure 2.1. Flowchart for the Attribute of Area

Figure 2.2. Flowchart for the Attribute of Length

As indicated in the discussion of the process of measuring,
measuring length and area involve the same steps. A flow chart for
length (see Figure 2.2, derived in the same way as the one for area) 
shows the relation between the two attributes with respect to the 
behaviors under question. 

While the steps to reach the terminal objectives (Figure 2.2,
box 13) of comparing and ordering measures of length are the same as
those for area, the behaviors necessary for some of the subordinate
objectives differ. The main differences are between the behaviors
indicated in boxes 1 and 6 in both flow charts. Length is a much
more easily identified attribute than area for children (see box 1
in Figures 2.1 and 2.2). Likewise, physical representation with
discrete units is a much simpler procedure for length than for area 
(see box 6 in Figures 2.1 and 2.2). Thus, there is a strong logical 
and instructional relation between length and area, but the expected 
behaviors are easier to attain for length. 

Furthermore, most second and third graders have attained many 
of the behaviors indicated in Figure 2.2. Prerequisite behaviors to 
the objectives concerning the unit are more likely to have been 
attained for length than for any other attribute. 

After two tryouts of the teach-test procedure these objectives
were specified: 

1) Given two lengths, the student indicates that he must 
know both the number and the unit before he can compare 
and order the lengths (see box 10 in Figure 2.2).

2) Given two lengths whose measurements have been expressed 
in different units, the student compares and orders 
these lengths by taking into account the relationship 
between the units (see box 11 in Figure 2.2). 

These objectives were chosen for the following reasons: 

1) They were found to be feasible from the pilot study. 

2) They require only prerequisite behaviors which most 
second and third graders have mastered. 

3) They are not stressed in elementary school programs and, 
hence, the interaction between the teach-test instruc- 
tional unit and previous instruction would be minimal. 

4) They are essential to the understanding of assigning 
measures to length and of comparing and ordering 
measures of length. 

5) They are objectives specific to the unit. 

6) They correspond to the differences between the treatments. 
Treatment U emphasizes the unit of area in a manner 
similar to the teach-test treatment of those objectives 
and Treatment N does not make this emphasis.

The teach-test procedure was used in the following way to classify
students into aptitude levels. Level I: Any student who had not
attained the specified behaviors by the end of the teach-test
procedure and who had not evidenced change in performance. Level II:
Any student who attained the specified behaviors only after the
teach-test procedure and who evidenced change in performance.
Level III: Any student who had attained the specified behaviors
before the teach-test procedure and maintained them throughout.
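
The classification rule just stated can be sketched as a small decision function. This is a simplified illustration, not the scoring procedure actually used: it assumes that attainment of the specified behaviors has already been judged as a yes/no outcome on the pretest and on the posttest, and it treats "change in performance" as a change in that outcome. The function and argument names are hypothetical.

```python
from typing import Optional


def aptitude_level(attained_on_pretest: bool,
                   attained_on_posttest: bool) -> Optional[str]:
    """Return "I", "II" or "III", or None if the stated rule does not apply."""
    if not attained_on_pretest and not attained_on_posttest:
        return "I"    # never attained the behaviors, no change
    if not attained_on_pretest and attained_on_posttest:
        return "II"   # attained the behaviors only after instruction
    if attained_on_pretest and attained_on_posttest:
        return "III"  # attained beforehand and maintained throughout
    # Attained on the pretest but not on the posttest: the three levels
    # as stated do not cover this case, so it is left unclassified here.
    return None
```

For example, `aptitude_level(False, True)` returns `"II"`. How a student who loses the behaviors between the two tests should be handled is not stated in the rule and is left open in this sketch.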

Specific Questions 

In light of the definition of aptitude in this study the specific 
problem may be restated: How does children's ability, as determined 
by a teach-test procedure, to learn concepts about a unit of length 
interact with the two treatments on measuring area? Three dependent
variables, achievement, transfer and retention, were identified.
Thus, the question of interaction was asked with respect to each
dependent measure. 

In addition to these questions of interaction, several other 
questions were asked: 

Is the teach-test procedure a valid predictor of an individual's 
success? 

Is it feasible to teach these area concepts to second and third 
graders? 

Is it feasible to teach about a unit of area or a unit of length 
to these students? 

To what extent are the performances on achievement correlated 
with performances on retention? 

Chapter III
RELEVANT AND RELATED RESEARCH 



Introduction 

While reviewing the educational and psychological literature one
begins to feel that one is attempting to prove the following theorem:

the amount of research in measurement is inversely 
proportional to the amount of use of measurement 
and support of its use in elementary school. 

The amount of research is reflected in Suydam and Riedesel's (1969)
interpretative study of research in elementary school mathematics in
which only three of the reported 305 studies dealt solely with measurement.
Only four more were cross-referenced with measurement. Although
there is an abundance of pre-measurement research such as Piagetian 
conservation research, there has been little attempt to relate it to 
instruction. In one of the few attempts to relate Piagetian pre- 
measurement and measurement research to instruction, Huntington (1970) 
analyzed the instructional sequence of linear measurement in School
Mathematics Study Group, Book 1. He found many discrepancies between
this sequence and Piaget's developmental stages. Even if such
discrepancies exist, it is not clear at this time how curriculum developers
could best use Piagetian research. Weaver (1972) in an article con- 
cerning the relevance of Piagetian research to instruction cautions 
mathematics educators: 

How significant it is that we have strained so
hard to generate implications for classroom in- 
struction in mathematics from research that thus 
far - with relatively few exceptions - could not 
have cared less for the classroom context and 
for the learning of mathematics by children with- 
in that context (p. 269). 

Likewise, in reporting what research says about the use of measurement
in other subjects, Suydam and Riedesel (1969) summarized:

All the researchers agreed that greater emphasis
should be placed upon understanding of basic 
quantitative concepts taught in elementary school 
mathematics (p. 117). 

Although the research related to measurement instruction in pro- 
portion to its use and support is scarce, there are three branches of 
this research which are particularly relevant to this study. These 
are reviewed first in this chapter; afterwards the research related 
to aptitude-treatment interaction and that related to the teach-test 
procedure is reviewed.

Relevant Measurement Literature 

The three branches of measurement research directly relevant to 
this study are research concerning: the role of the unit, the 
relation between measuring length and measuring area, and the measuring 
of area. Each of these is reviewed here. 

The role of the unit. Ellison (1972, p. 171), in one of the few
articles addressed to the role of the unit, warns of potential trouble
that may occur with a unit. The main thesis of his article is the
confusion that may arise if the unit is thought of as an identity
and is not conceived of as subject to change within specific problems.
In surveying the current elementary textbooks, the investigator found 
that these sources of difficulty are not approached directly. In 
fact, there is little emphasis on the unit. Many of the teacher 
manuals touch on the role of the unit in measurement, but the texts do 
little for the children other than provide congruent units, discuss 
appropriate units for specific problems, and teach ways to represent 
the attribute. 

The unit has been treated by research as it has been by curric- 
ulum, that is, incidentally. Major exceptions to this are studies 
by Piaget (1964), by Gal'perin and Georgiev (1969), and by Carpenter 
(1971). 

Piaget's levels of development of measurement depend greatly on
the child's facility with the unit. He describes children in Stage I 
and Level IIA as lacking in two understandings: 

In the first place they are ignorant of the compo- 
sition of parts which means that they cannot under- 
stand that the set of cards (units) taken together 
equals the total area B, and that a part of that 
set alone is equal to the total area A, so that 
A < B because the part < the whole. In the second 
place they cannot see that the sections taken to- 
gether equal the intact whole (p. 294). 

Level IIB is a period of change: 

we see the beginnings of a common measure 

the beginnings of transitivity because there is 
better conservation. In other words, the compo- 
sition of parts within a whole and the composi- 
tion of positions and change of position are more 
coordinated (p. 295). 

He divides his third stage into two levels:

At Level IIIA, children use their composite
common term to compare A and B but in doing so 
they simply count all the cards as if they were 
equal units and ignore their inequality; but 
at Level IIIB they understand the notion of a 
unit and so they take the size of the measuring 
elements into account (p. 295-296). 

Thus to Piaget, measurement presupposes the understanding of the 
unit. However, his classical experiments were highly structured and 
many notions about the unit were not examined. Some of these notions 
were involved in the curriculum developed by and the research of 
Gal'perin and Georgiev (1969). 

They designed fourteen individual exercises involving measurement
tasks for a study with sixty children in a Russian kindergarten. These
exercises involved measuring with a unit which was sometimes made up
of parts, realizing the need for a common unit when comparing two
measurements or realizing the need for common units. They were designed
so that the relationship between visual comparisons and comparisons
made by measuring could be examined. A posttest revealed that
kindergarten children taught by traditional Russian methodology showed 
no improvement on such tasks. Gal'perin and Georgiev blamed this lack 
of improvement on the treatment of a unit: 

forming the concept of a unit as an entity results
in an orientation that does not allow for the 
application of the unit as a means for measuring 
and counting. Such an orientation leads to direct 
comparison and visually quantitative distinctions 
(p. 194). 

Using the results of this study and an analysis of the existing 
teaching methods, they proposed a series of operations to lead to the 
formation of elementary number concepts. Central to this sequence 
was the role of the unit. Sixty-eight lessons were developed to reflect
this sequence of operations. These lessons were used with 50
children in the same kindergarten as the initial study. Pretest 
scores on the same exercises were not much different (nothing but 
percentages of correct responses to items is reported) from the initial 
group, but posttest scores revealed that almost every child success- 
fully completed every task. Their interviewing techniques also showed 
that these children had gained many sophisticated (for kindergarten) 
number concepts. 

Although their research showed that it is feasible to teach 
notions about the unit to six and seven year olds and that there may 
be much payoff with number concepts, more important to this study is 
their conception of the role of the unit in measurement and in forming 
elementary number concepts. They build a strong logical argument for
making the unit central to the development of elementary mathematics 
at the same time pointing out pitfalls that must be avoided. 

Carpenter (1971), in his attempt to resolve the conflicting views
of Piaget and Gal'perin and Georgiev, devised 18 tasks for children in
kindergarten through second grade. Ten of these tasks involved
measurement and the unit. Either visually different units,
indistinguishably (visually) different units, or the same units were used
to measure the liquids in two containers. Carpenter also varied the true
relation between the two liquids (equal or not equal) and the initial and
the final visual relation between the two liquids (not seen, visually
correct or visually deceiving). He also varied the relationship between
the unit and the measure in such a way that when the true relation
was $Q_1 = Q_2$ or $Q_1 > Q_2$, the units $U_1$ and $U_2$
were chosen in one case to make the measures $m_1$ and $m_2$ equal, and in
the other case to make $m_1 < m_2$. Combinations of these variations
produced the ten tasks.

In this portion of his study he individually tested 129 first and 
second graders on subsets of these tasks. He concluded that "by the 
end of the first grade, virtually all students realize that the quan- 
tity that measured the most units must be the greatest." However, 

only 70% of the Ss tested were able to use measure- 
ment results if they were followed by conflicting 
visual ones. Only 59% of the Ss tested demonstrated
any knowledge that variations in unit size affected 
measurement results, and as few as 40% of Ss were 
able to apply this knowledge to problems in which 
quantities were measured with different units. 
This figure dropped to 25% of the Ss when the larger 
unit was not visibly distinguishable, and only 6% 
of the Ss were able to use results of measurement
operations to determine the larger unit when it 
was not visually apparent (p. 99). 

In conclusion he recommends that 

If one is really concerned with mastery of measure- 
ment concepts with different units of measure, it 
would seem necessary to provide a wide range of 
experiences that help the child focus on more than 
one immediate dominant dimension. It is important 
for teachers and curriculum developers to know 
when they are providing experiences that can be 
mastered, when they are providing experiences that 
may be learned superficially and when they are 
providing experiences that may be beyond the 
capabilities of many of the children (p. 106).

This investigator's study is partially addressed to gaining this 
knowledge - in reference to area measurement. 

The relation between measuring length and measuring area. Several
reasons were given in Chapter II for selecting the attribute of length
for the teach-test procedure. One of these was that it was a simpler
attribute than area to measure. Logically and mathematically area
is a more complicated attribute. There is overwhelming evidence that 
pedagogically length has been assumed to be the simpler. In all the 
textbooks reviewed measuring area never precedes measuring length. 
Paige and Jennings (1967) surveyed 39 series and found that most text- 
books had introduced the standard unit of length by the third grade, 
but area measurement is not often introduced until fourth, fifth or 
even sixth grade (p. 356). However, there are few psychological studies
which lend evidence to support or to deny this assumption. 

Piaget (1964) proposes the same stagewise development for measurement
of both length and area. In replicating his experiments Lovell,
Healy and Rowland (1962) found the same stagewise development, but
their research showed that the length stages are reached at an earlier
age than the area stages. Beilin and Franklin (1962) showed that in
training 6 and 8 year olds in length and area the older Ss could be 
taught both, but the younger could make progress only in acquiring 
length measurement. 

Although no study could be found which gave direct evidence that 
understanding length measurement would facilitate the understanding of 
area measurement, research and practice do indicate that the converse
is true. Hence, the investigator felt comfortable in hypothesizing
that any subject who could not handle length measurement would have
difficulty with area measurement. The other hypotheses concerning
the interaction of students' ability to handle length and area were,
admittedly, more open to question.

The measuring of area.

The approach to area needs very careful consideration. 
In 1941 The Scottish Council for Educational Research 
published the results of their research into the 
teaching of area. Out of 1000 children 444 decided 
that the definition of area was "the distance all 
around it." Even today they make similar replies. 
The confusion seems to be due to . . . (Marsh, 1969). 

This quote represents a multitude of statements by individual
educators and commissions. However, there are few studies other than
conservation studies and Piagetian measurement studies (Beilin, 1964;
Lovell, 1971; Needleman, 1970) concerned with determining the nature
of this confusion. Beilin (1966) presented first and second graders
with what he classified as quasiconservation area tasks. In contrast 
to the usual conservation tasks in which the transformation of one 
region is shown, he presented two regions, one of which had already 
been transformed. These tasks were actually measurement tasks in that 
the unit was specified and the subject was allowed to make the com- 
parison between the two regions either by counting (termed iteration 
by Beilin) or by translocating (mentally moving the units of one 
region so that the shapes of the two regions are comparable). Several 
procedures were devised to train the subjects in iteration or trans- 
location. Another experimental group received no training other than 
feedback of the correctness of their decisions on the pretest. Beilin 
found the feedback method was the most effective in producing change; 
thus he concluded that many subjects only needed to be reoriented
to process the perceptual data differently. Beilin's study is relevant
to this investigator's study for three reasons. First, he found that 
second graders were more successful in area tasks than first graders.
This gives reason for placement of area study in the second grade 
rather than the first. Second, he found that many more students 
naturally use the translocation method. This seems to indicate that 
the use of concrete materials which allows the student to physically 
make the translocations probably precedes merely counting static units. 
Third, he found that if students were reoriented to handle the perceptual
data differently many could perform the tasks successfully. Hence,
many students at this age seem only to need help in understanding the 
problem posed; it is not a problem beyond their ability. 

Wagman (1969) hypothesized four levels of development of conservation
of area similar to Piaget's. She found that children at all
three age levels, 8, 10 and 11, had attained conservation and recom- 
mended that the study of area concepts should begin in the second or 
third grade. 

The investigator found only two relevant studies which involved
instruction in area concepts other than those related to the
conservation of area. Luchins' (1949) descriptive study dealt mainly
with the feasibility of teaching methods of, rather than formulas for,
finding the area of triangles, rectangles and parallelograms to sixth 
graders. He also found that some five year olds could handle such 
problems. Luchins' claim that the intuitive approach to finding areas 
was more meaningful than a formula approach supports DMP's approach 
and hence, the approach to area adopted by this investigator. Johnson's 
(1970) study dealt mainly with the effect of varying the amount of 
concrete experience in teaching perimeter, area and volume formulas to
fourth, fifth and sixth graders. He found that those exposed to his 
maximum treatment of concrete materials were the most successful. He 
also found that age was a significant factor in learning concepts 
related to area. Children older than 11 years were more successful 
than those younger. His study lends support to this investigator's
decisions not to introduce formulas at the second and third grade level 
and to rely heavily on concrete materials. 

The well acknowledged difficulty which children encounter in 
area measurement combined with the paucity of research identifies this
problem as one which requires major investigations. 

Aptitude-Treatment Interaction Literature

Since Cronbach's 1957 presidential address to the American 
Psychological Association in which he intimated that treatments may 
interact with abilities or personality traits, there has developed a 
substantial interest in aptitude-treatment interaction research.
Cronbach and Snow (1969), in a comprehensive report on individual
differences, reviewed and critiqued many ATI studies and studies which
lend themselves to ATI analysis. They concluded that most ATI studies
have failed to produce interaction and attributed this to
conceptualization problems, inappropriate analyses and the early state
of the art. Although results to date have been discouraging, they urged
against a defeatist attitude and suggested ways to attack some of the
problems.

Bracht (1969) categorized 90 ATI studies according to the type of
treatment, aptitude, dependent variable and interaction. In consider- 
ing 108 of the possible interactions he found only five attained the 
goal of ATI, that is, had disordinal interactions. His definition of 
disordinal interaction is more stringent than Lubin's. Lubin (1961) 
only required that the regression lines of a significant interaction 
cross within the range of aptitude. Bracht added the restriction that 
the differences between the alternative treatments for at least two 
levels of aptitude must be significantly non-zero and must differ in 
algebraic sign. None of the five studies which he classified as dis- 
ordinal were conducted in a classroom environment. The treatments 
were brief and controlled; the aptitudes included personality traits, 
abilities and social class; and the dependent variables were very 
specific tasks. 
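
The difference between the two definitions can be made concrete with a short sketch. It is an illustration under stated assumptions, not a reproduction of either author's analysis: each treatment is summarized by a straight regression line of outcome on aptitude, the interaction is assumed to have already been found significant, and the significance of each treatment difference is supplied by the caller rather than computed. All function and parameter names are hypothetical.

```python
def lubin_disordinal(intercept_a, slope_a, intercept_b, slope_b,
                     aptitude_min, aptitude_max):
    """Lubin's requirement: the two regression lines cross within the
    observed range of aptitude (interaction assumed significant)."""
    if slope_a == slope_b:
        return False  # parallel lines never cross
    crossing = (intercept_b - intercept_a) / (slope_a - slope_b)
    return aptitude_min <= crossing <= aptitude_max


def bracht_restriction(level_differences):
    """Bracht's added restriction, checked on one (difference,
    is_significant) pair per aptitude level, where difference is the
    mean under treatment A minus the mean under treatment B.

    At least one level must significantly favor A and at least one
    other level must significantly favor B.
    """
    favors_a = any(sig and diff > 0 for diff, sig in level_differences)
    favors_b = any(sig and diff < 0 for diff, sig in level_differences)
    return favors_a and favors_b
```

Under these assumptions, lines that cross inside the aptitude range satisfy Lubin's definition even when the treatment difference at one end is too small to be significant; Bracht's restriction rejects such a case.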

Subjects of all ages were used in the 90 studies reported. The 
number of subjects varied greatly from small studies of 30-40 subjects 
to ones with 700 or more. The five studies with disordinal interactions
had a similar age span, but the average number of subjects was
about 100. Thus, from this summary this investigator gained no direction
as to age or to number of subjects.

A controlled treatment was defined by Bracht to be one in which 
"the degree of attainment of the treatment objectives was largely con- 
trolled by the presentation of specific and prescribed treatment tasks 
and little opportunity existed for the subjects to be influenced by 
other external conditions" (Bracht, 1970, p. 628). Eighty-five of the 
108 studies, including all five with disordinal interactions, were
controlled. Many of these involved basic learning tasks or relatively
short self-instructional, often computer oriented, units. Mathematical
concepts were often used in the basic learning tasks, but four
controlled studies (Becker, 1967; Behr, 1970; Carry, 1968; and Davis,
1968) investigated the learning of mathematics. These were all
self-instructional units for secondary students; therefore, no further
review of them is included here.

Two uncontrolled treatments involved mathematical instruction of 
third and fourth graders. Lucow (1964) examined the interaction of 
three levels of ability (IQ) with two six week treatments of multipli- 
cation and division, one with Cuisenaire rods and one without them. 
Using gain scores of 254 third graders on a multiplication and division 
test as a criterion, no interaction was found. Anderson (1949) found
some tendencies toward ordinal interaction in looking at drill versus
meaningful methods of teaching arithmetic to fourth graders. However, 
the distinction between the two methods and the teacher variable of 
the 18 classes makes one skeptical of even the tendencies. 

This analysis of the treatments indicated that it was probably
best to control the treatments as much as possible, even in a class- 
room setting for primary children. A warning from Cronbach and Snow, 
"educational policy cannot be based on what the pupil does with his 
first encounter with an instructional style," influenced the length 
of the treatments in this study. A brief treatment like many of the
ones reviewed did not appear to be appropriate. Otherwise, this
literature on treatments had little relevance to this investigator's
study.

Aptitudes, or personological variables as they are referred to by
Bracht, were classified according to whether they were factorially
simple or complex. "Measures of specific abilities, interests, atti-
tudes, personality traits, and social, economic and educational status
were classified as factorially simple" (Bracht, 1970, p. 284). Measures
of cognitive ability and achievement were considered complex. Most of
the studies used general abilities such as IQ, but many used previous 
achievement. Specific abilities determined by tests from Guilford's 
Structure-of-Intellect battery or other similar tests constituted most 
of the factorially simple variables. Personality measures and other 
similar factorially simple variables were not often used. 

No study was found which defined aptitude as the measure of the
ability to learn. The only study which established aptitude by con- 
sidering a task similar to the treatment task was a paired-associate 
study by Davidson (1964). Davidson determined his two ability groups 
on the basis of a pre-experiment paired-associate task similar to one 
of his five treatments. He found no overall interaction, but there was 
a tendency toward a significant interaction between two of his treat-
ments and the aptitude. The small cell sizes would not permit a post
hoc analysis of this interaction.

Thus, the studies reviewed lent little direction to the develop- 
ment of the aptitude measure used in this study. This direction came 
from the teach-test literature which is reviewed in the next section. 

The dependent variable was usually specific; in fact, it was 
usually immediate achievement. One of Bracht's recommendations for
further ATI investigations is to use other measures such as transfer
and retention. This advice was followed in designing this study. Past
ATI studies have not proven to be successful in producing the desired
interactions. However, they, along with critiques such as Cronbach
and Snow (1969) and summaries such as Bracht, provide valuable assis-
tance in planning further such studies.

Teach-test Literature 

The motivation to use a teach-test procedure was derived from the 
work of Ralph T. Heimer. As far as the investigator could determine,
Heimer and his associates' work (Heimer, 1966; Heimer and Lottes,
1968) constitutes the entire literature on this procedure.

Heimer describes in the American Mathematical Monthly, October
1966, the procedure and three studies. The first study was conducted
with 106 entering college freshmen to ascertain the contribution this
procedure could make with respect to predicting success in college mathe-
matics courses. Although the results were inconclusive, they showed that
a high teach-test score corresponded to a high CMT (ETS Cooperative
Mathematics Test) score but not vice versa. Concluding that "Teach-test
apparently was discriminating among students at higher levels" (p. 885),
Heimer was prompted in the summer of 1965 to conduct two parallel stud-
ies, one at Florida State University and the other at Florida A & M
University. 

Both of these studies were conducted in connection with the Sec-
ondary Scientific Training Program (SSTP), a summer program for talented
high school students. Twenty-three students at the beginning of the
program at FSU were administered the teach-test package as well as the
Cooperative School and College Ability Test (SCAT), Form UA and the
Cooperative Reading Test, Reading Comprehension, Form 1. At the end
of the summer the instructors rated the students on a 14 point scale.
Correlational analysis revealed that the T-T score was the highest cor-
relate of the scale, although not significantly different from the SCAT
Quantitative score. It was an especially good predictor for the top
students. The criterion used at Florida A & M was the rating from 1 to
25 of the twenty-five students involved. Again, the teach-test procedure
was good for predicting the exceptionally high students.

A more extensive study was conducted in the summer of 1967. Seven 
colleges offering SSTP courses participated in this tryout of teach-
test packages. This involved 259 students and 14 courses. Besides the
teach-test scores, scores on SCAT, II and IV (Insightful Computation and
Mathematical Reasoning) and on five subtests of Personal Value Inventory
(PVI) were obtained. Two criterion scores, course grade and instructor's
rating, were used in the analysis. 

Five major hypotheses were investigated: 

1. If the TEACH-TEST procedure is compared with con-
ventional procedures for predicting scholastic
success in the study of mathematics, then the
TEACH-TEST correlation coefficient will be
highest.

2. The TEACH-TEST procedure will measure on factors
not taken into account by conventional procedures
for predicting scholastic success.

3. The TEACH-TEST procedure will discriminate more
effectively at the high-success levels than will
conventional procedures for predicting scholastic
success.

4. If the TEACH-TEST procedure and conventional pro-
cedure for predicting scholastic success are
compared on differences in effectiveness at
different levels of prior educational opportunity,
then the differences in effectiveness will
increasingly favor TEACH-TEST as the level
of prior educational opportunity decreases.

5. If the correspondence between TEACH-TEST con-
tent and criterion content increases, then
the predictive effectiveness of TEACH-TEST will
increase. (Heimer and Lottes, 1968, p. 9)

Only the second hypothesis was supported by the analysis. How-
ever, there was some evidence to support the first hypothesis for
some of the programs. There was not enough evidence to seriously test
the fifth hypothesis, but there were trends which seemed to indicate 
its plausibility. 

Thus, the most relevant information gained from these studies 
was the procedure itself. However, hypotheses 1, 2, 3, and 5 have 
some relation to the study or to recommendations for further study. 

Although the first hypothesis was not supported there were some 
indications that if the course was closely related to the teach-test 
instructional unit the hypothesis was more plausible. The teach-test 
instructional unit for this study was constructed to be closely re- 
lated to the treatments. 

The second hypothesis indicates that the teach-test procedure 
measures factors not taken into account by conventional predictors. 
An interesting question to be pursued is whether a combination of such
predictors would be a more valid criterion for determining aptitudes.

Heimer strongly suggests in all his studies that this procedure
is more effective for predicting success of top students. Since his
third hypothesis was not supported, however, he failed to show that
the procedure was a more effective discriminator than conventional
predictors at the top fifth level or the second fifth level. Although
this investigator's study cannot test
this hypothesis directly, its data could easily be used to test the
hypothesis that the teach-test procedure was a better predictor for the
upper level of aptitude than for the lower level of aptitude.

In the recommendations for further study, Heimer and Lottes make 
a plea for the tryouts of teach-test packages which closely correspond 
to the content of the course. This study provides the opportunity to 
reexamine their fifth hypothesis by testing whether the teach-test pro- 
cedure was a better predictor for Treatment U than for Treatment N. 

Although Heimer's work is inconclusive and there is danger in gen-
eralizing any of his findings, it did provide the motivation and guide-
lines for using the teach-test procedure in this study.

Summary 

On the one hand, the search for relevant literature did not prove 
to be very fruitful or encouraging. The research on measuring was 
scanty and often not directly relevant to this study. The ATI research 
results were discouraging. The research utilizing the teach-test pro- 
cedure was of a neophyte nature. 

On the other hand, this very lack of research and of positive 
results coupled with the conviction of mathematics educators that
measurement is an important topic and the support of some prominent
educational researchers for the pursuit of ATI studies, especially 
those which define aptitudes in new ways, gave encouragement to the 
investigator. 

Thus, with whatever knowledge could be gleaned from past
research and experience, the investigator proceeded to plan the study, 
a description of which follows in the next chapter. 



Chapter IV 
DEVELOPMENT OF THE STUDY 





Introduction 

This chapter gives a detailed description of the development of 
the study. First, the development of the experiment is detailed and 
then the research strategy used in conjunction with this experiment 
is described. 

The instructional treatments, the teach-test treatment and the 
observation instruments used in this experiment were developed in 
accordance with the first two phases of the curriculum development 
model of Romberg and DeVault (1967). These phases, analysis and 
pilot, are iterative. That is, first both mathematical and instruc- 
tional analyses are made of the curriculum to be developed. Then 
the curriculum is developed and piloted. After analyzing the pilot 
there are two choices: (1) if the results of the pilot were satis- 
factory one proceeds to the next phase, validation, and (2) if the 
results are not satisfactory one recycles through the first two phases.

Figure 4.1 shows the cycle of analysis and pilot phases for this 
experiment. The steps have been grouped into three categories, de-
velopment of the problem, development of the teach-test procedure
and development of the treatments, to facilitate the discussion which
follows.

Development of the Problem (Steps 1-4)

This section describes the development of the problem by describ- 
ing the original problem, the pilot studies conducted to investigate 
this problem, and their influence both on the formulation of the final 
problem and on the treatments used in the final study. 

Step 1: Formulation of the original problem. The formulation of
the original problem was in response to the curriculum development de-
scribed in Chapter II. Originally, the problem was to investigate the
relative effectiveness of two instructional treatments of area; one 
reflected the DMP approach (Treatment B) and the other reflected a 
more traditional approach (Treatment A) . No specific hypotheses were 
proposed; the purpose of the first pilot studies was to informally 
investigate this problem. 

Step 2: Analysis for Treatments A and B. Even for an informal
investigation both a mathematical analysis of the content and an instruc-
tional analysis must be made. These analyses were made but are not re-
ported here since the mathematical analysis is described in Chapter II
and the instructional analysis is reflected in the description of the
pilot studies which follows.

Step 3: Pilot studies of Treatments A and B. In the spring of
1971 both Treatments A and B were piloted with a small number of chil-
dren at Randall Elementary School, one of the DMP developmental schools.

Treatment B was designed to reflect an attribute-by-process approach.
That is, an attribute is identified and processes are extended or de-
veloped with that attribute. In this case, the attribute of area was
identified and the processes of comparing, ordering and measuring
were introduced sequentially. Activities were developed which in-
volved superimposing, exact and approximate coverings, cutting and
rearranging. Nine 25-minute lessons and an individually administered
test were developed. Six first graders from several classrooms parti-
cipated; however, due to illness only four were present for the entire
treatment. The small sample gave the investigator an opportunity to
ask many probing questions. Although many of the activities were suit-
able for first graders, it was felt that too many inferences were ex-
pected of these children.

Treatment A developed area in a way similar to many traditional 
textbook series. Area was defined as a number. The children were 
asked to assign a measure, by counting, to a region whose covering 
with units was shown. Using numbers assigned in this method, the chil- 
dren were asked to compare and order the regions on area. Thus, area 
was always considered as a number; no direct comparing, cutting or re- 
arranging was ever required. Four second graders participated in this 
treatment which was planned for five days but was completed in three. 

Although both groups were formally tested the information gained 
from the testing was considered as only another observation of the 
children and their responses. From the two treatments and tests many 
subjective decisions were reached: 

First, the approach taken in Treatment B was preferable to that 
in Treatment A. The investigator felt that those in Treatment A were 
only manipulating numbers with no reference to area; this feeling was
confirmed by the test in which the children were asked first to
visually compare pairs of regions and then to compare the same regions 
covered by pieces in such a way that the comparison of the number of 
pieces did not correspond to the comparison of the area. They were 
able to do the former but not the latter task. 

Second, many instructional problems or successes were evident 
which needed to be taken into account when designing the treatments 
for the final study. 

Third, first grade was too early to expect much success in a
treatment which went as far as Treatment B, although many of the ear-
lier activities are suitable for this age group. Thus, the study
should be conducted with an older group of children.

Fourth, it was possible to design a group test for second graders; 
however, the format needed to be simplified. 

Finally, the investigator felt a more interesting and important 
question became evident. In both groups the lack of understanding of
the unit of area was a problem for some children but not for others.
What could be done, if anything, for those children who lack this under-
standing? What is best for those children who already have this under-
standing?

Step 4: Reformulation of the problem. In summary, this pilot
study not only gave the investigator more insight into the pedagogical 
problems, but also changed the direction of the study. Although the 
relative effectiveness of these two treatments should be investigated, 
both treatments depend upon the child's understanding of the unit.
Therefore, before looking at the contrast between these treatments
the investigator decided to contrast two treatments by their emphasis
on the unit. Using Treatment B as a basis, two treatments would be de-
veloped, one of which would stress the unit and the other of which
would not. This involved reconceptualization of the prob-
lem and led to the formulation of the problem as described in full in
Chapter II.

The question asked was: How do children's abilities to learn 
about a unit of length interact with two treatments on measuring area? 
A teach-test procedure was used in order to determine the children's 
ability to learn. The development of this procedure is described next. 
Afterwards the development of the treatments is given. 

Development of the Teach-Test Procedure (Steps 5a - 10a) 

The teach-test procedure and the treatments were planned together 
so that the desired interaction could be taken into account. However, 
for ease and clarity of description the development of the teach-test
procedure is discussed first. As stated in Chapter II, the decision
was made to teach about the unit of length in the teach-test procedure.
The rationale for that decision was presented there; this section de-
scribes the evolution of this procedure from its first pilot study to
its final form.

Step 5a: Analysis for the teach-test procedure. There appear to be
two prominent misunderstandings about a unit of measure. They concern
the need to use congruent units when assigning a measure and the need
to use common units when comparing two regions. The first pilot study
of the teach-test procedure focused on both of these needs. A test and
two lessons were developed to test and teach the following objectives:

1. Given a length the student indicates that it must 
be represented with congruent units in order to 
assign a measure to it. 

2. Given two lengths whose measurements have been ex- 
pressed in terms of different units, the student 
indicates that the lengths cannot easily be compared 
unless the same unit is used to represent both lengths. 

The first lesson presented the children with the problem of repre- 
senting lengths with congruent units as a communication problem: "How 
do you tell someone else the length of an object if non-congruent units 
represent it?" After a group discussion the children worked in small
groups and individually on similar problems. 

The second lesson presented the problem of comparing two lengths 
when non-common units have been used to measure the lengths. Again the
problem was posed in a communication context. In the second part, the 
children had a contest in which each measured an object with congruent 
units and then they tried to compare their length with an opponent's 
length which may have been measured by a different unit. In the third 
part they responded to similar comparison questions on a worksheet. 

Step 6: Pilot Study #1 of the teach-test procedure. The investi-
gator, along with the teacher who participated in the final study, tried
this version of the teach-test procedure with 23 third graders at
Randall Elementary School. On the first day a pretest was given and
Lesson 1 was covered. On the second day, Lesson 2 was covered. A post-
test was scheduled for the end of the second day, but the second lesson
lasted longer than the forty minutes allotted, so the posttest was given
the third day.

Step 7: Reanalysis #1 for the teach-test procedure. Many obser-
vations were made during the first pilot study. The first lesson did
not go well; the children did not see the need to represent the length
with congruent units. They were perfectly happy to describe the repre-
sentation more fully by saying, for example, three short units and four
long units. In the next lesson when they were asked to compare lengths,
they then saw a need for congruent units as well as common units. Thus,
it was decided to de-emphasize the first objective, and to drop the first
lesson. The contest in the second lesson created much enthusiasm but 
was unmanageable. It was decided to save such activities until the main 
treatment when the teacher knew the pupils better. Problems with work- 
sheets of the second lesson were found. Many children were ignoring the 
measurements and relying on the visual stimuli, although the stimuli 
were often deceiving. Too much was planned to be covered in two days; 
this resulted in two changes, a shortened treatment and better organi- 
zation so that no time would be wasted in management. The children 
appeared capable of handling simple ratio relationships between units. 
This capability was utilized more in the next pilot study. There 
were positive aspects of the two days which were retained in the sub- 
sequent tryout. 

The test which was used for both the pretest and the posttest con-
sisted of 23 items. Three items tested prerequisite behaviors and
the other items tested the two objectives. Two difficulties were noted
with the test. First, it was too easy; this was partly due to always
showing the objects whose lengths were to be compared. The lengths
could always be compared visually or by using a pencil to represent
one length and checking this against the other. Thus, there was no
need to rely on the measurements. Second, the use of the same test
for both the pretest and posttest caused both motivational problems
("I knew this before, why do I have to show you again?") and measure-
ment problems ("I remembered I answered M on this before."). Otherwise,
with minor exceptions, the test was clear and the length was ap-
propriate. The results of the tests showed that most of the chil-
dren had mastered the objectives both on the pretest and the posttest.
The investigator felt this was an artifact of the test since the chil- 
dren did not have to rely on the measurements. 

Step 8: Pilot Study #2 of the teach-test procedure. The teach-
test procedure was modified in accordance with the findings of the
first pilot and piloted again. This time, twenty-one second graders
at Randall Elementary School participated. On the first day, after
the pretest, the first lesson, on comparing lengths that had been
represented with different units, was presented. On the second day,
the children worked more independently with problems of comparing and
ordering lengths which were represented with different units.

Step 9: Reanalysis #2 for the teach-test procedure. Although the
second pilot study went more smoothly, several changes were made for
the final study. The ambiguous questions on the worksheets were elimi-
nated and the format simplified. The test items which were unclear and
the items which required a "cannot tell" response were changed so that
the comparison could be made. The second graders had some difficulty
in following the complex directions and these changes enabled the di-
rections to be much more straightforward.

The results of the pretest and posttest indicated a positive change
for about half of the children over the treatment. A few children had
mastered the objectives on the pretest; many more had on the posttest.
The other half showed no change in performance.

Step 10a: The final plan for the teach-test procedure. As a
result of the pilot studies and subsequent reanalyses the final teach-
test procedure was designed.

Two objectives were specified:

1. Given two lengths the student indicates that he
must know both the number and the unit before he
can compare and order the lengths.

2. Given two lengths whose measurements have been
expressed in different units, the student compares
and orders the lengths by taking into account the
relationship between the units.

The lesson plans may be found in Appendix A. Essentially, the
two lessons remained the same as they were for the second pilot. The
first lesson was planned to last twenty minutes and to be given imme-
diately following the pretest. It is a large group activity which
introduces the difficulty of comparing lengths that have been repre-
sented with different units. This is done by representing two lengths
$L_1$ and $L_2$ with units $U_1$ and $U_2$, respectively, and with the
measures $M_1$ and $M_2$, respectively, in the following ways:

1. Given $L_1 > L_2$,
$U_1$ and $U_2$ are chosen so that $U_1 > U_2$ and $M_1 = M_2$,

2. Given $L_1 > L_2$,
$U_1$ and $U_2$ are chosen so that $U_1 > U_2$ and $M_1 < M_2$,

and

3. Given $L_1 > L_2$,
$U_1$ and $U_2$ are chosen so that $U_1 = U_2$ and $M_1 > M_2$.
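
The three cases can be checked with simple arithmetic if a length is idealized as the product of its measure and the size of its unit. The sketch below makes that assumption explicit. The function name and the numerical values are hypothetical and were chosen only so that each tuple satisfies the conditions of the corresponding case; they are not taken from the lesson plans.

```python
from fractions import Fraction


def compare_lengths(m1, u1, m2, u2):
    """Compare two lengths, each idealized as measure * unit size.

    Returns ">", "<" or "=" for length 1 relative to length 2.
    """
    length1 = Fraction(m1) * Fraction(u1)
    length2 = Fraction(m2) * Fraction(u2)
    if length1 > length2:
        return ">"
    if length1 < length2:
        return "<"
    return "="


# Hypothetical values, one (M1, U1, M2, U2) tuple per case.
cases = {
    "case 1: U1 > U2, M1 = M2": (4, 3, 4, 2),
    "case 2: U1 > U2, M1 < M2": (3, 4, 5, 2),
    "case 3: U1 = U2, M1 > M2": (6, 2, 4, 2),
}
for label, (m1, u1, m2, u2) in cases.items():
    print(label, "-> L1", compare_lengths(m1, u1, m2, u2), "L2")
```

In each of these hypothetical cases the first length is the longer one, yet only in the third case does comparing the two measures alone give the right order. That is the difficulty the first lesson is meant to introduce.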

The second lesson was planned to last forty minutes. In the first
half the children work individually comparing and ordering two lengths.
Each child receives a packet of strips which are color coded with re-
spect to length. These help him answer questions presented on a work-
sheet about the order of the lengths of, say, 5 red strips and 7 blue
strips. The questions are structured so that the measures alone do not
always specify the order relation between the two lengths. The second
half of the lesson is designed as a large group activity in which the
children are shown two measurements without the visual stimuli of the
two lengths. They are asked to compare and order the lengths and to
give reasons for their decisions.

The two instruments $O_1$ and $O_2$ de-
veloped for the teach-test procedure may be found in Appendix B. The
pretest $O_1$ consisted of 22 items; the first two items tested pre-
requisite behaviors. These items were used to eliminate any student
who did not possess these behaviors, but not in any further analysis.
The remaining twenty items tested the two objectives of the teach-
test. The items varied the type of stimuli; combinations of the
following possibilities were made. The two lengths were either vis-
ually comparable, visually misleading, or not shown; the units repre-
senting the two lengths were either the same or different; and the
relation between measures was either in the same or different order
as the actual lengths. Thus, the student could not depend upon visual
comparison or numerical comparison, but was forced in most instances
to coordinate the measure and the unit. The posttest $O_2$ consisted of
items from the pretest arranged in a different sequence and denoted
with different labels.
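
The way the items vary can be laid out by crossing the three stimulus dimensions named above. The sketch below only enumerates the possible stimulus types. It does not reproduce the actual items, and nothing here indicates how the twenty items were distributed over these types. All names are illustrative assumptions, as is the helper that lists which shortcut, visual or numerical, would give the correct order for a given type.

```python
from itertools import product

# The three dimensions along which the item stimuli were varied.
VISUAL = ("visually comparable", "visually misleading", "not shown")
UNITS = ("same unit", "different units")
MEASURE_ORDER = ("same order as lengths", "different order from lengths")


def stimulus_types():
    """Return every combination of the three stimulus dimensions."""
    return list(product(VISUAL, UNITS, MEASURE_ORDER))


def shortcuts_that_work(visual, units, measure_order):
    """List the shortcuts that would give the correct order of the lengths."""
    works = []
    if visual == "visually comparable":
        works.append("visual comparison")
    if measure_order == "same order as lengths":
        works.append("numerical comparison")
    return works


for stimulus in stimulus_types():
    print(stimulus, "->", shortcuts_that_work(*stimulus) or "neither shortcut")
```

Types for which neither shortcut works are the ones that force the student to coordinate the measure with the unit.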



Development of Treatments (Steps 5b, 10b) 

The two treatments were developed concurrently with each other and 
with the treatment for the teach-test procedure. There were two main
steps in this development. First, an extensive analysis, both mathe-
matical and instructional, was made for the treatments. The instruc-
tional analysis included a reanalysis of the pilot study of Treatment B.
Secondly, the specific lesson plans and testing instruments were de- 
veloped. This section describes both of these steps. 

Step 5b: Analysis for the treatments. The first part of the anal-
ysis was a mathematical analysis of the process of measuring as it re-
lated to area. This analysis, which was described in Chapter II, re-
sulted in the selection of the eight behavioral objectives listed in
Table 4.1.

Table 4.1

BEHAVIORAL OBJECTIVES FOR TREATMENTS
U AND N


1. Given a region, the student identifies area as an attribute of 
that region. 

2. Given two regions, the student visually compares and orders them
on the attribute of area.

3. Given two regions, the student directly compares and orders them
on the attribute of area.

4. Given a region, the student physically represents the area with 
discrete objects. 

5. Given a region, the student symbolically represents the area with 
a number and a unit (measurement). 

6. Given the measurements of the areas of two regions, the student 
compares and orders them using their measurements. 

7. Given a region, the student symbolically represents the area with
a number (measure).

8. Given the measures of the areas of two regions, the student compares
and orders them using their measures.



Each treatment group was to be sequenced through these eight be-
haviors. However, some of the objectives are interpreted differently
for the two treatment groups. The first three objectives do not ex-
plicitly involve a unit; thus they were treated the same across
groups. Differences between the treatments are reflected in Objectives
4 and 5. A region may be represented either with congruent units or
non-congruent units. Subjects in Treatment N represented regions only 
with congruent units, but subjects in Treatment U also used non- 
congruent units. Thus, subjects in Treatment N had less reason than 
those in Treatment U to record the unit used in representing. Objec-
tive 6 was also treated differently across groups. Treatment N's sub-
jects were presented regions which they covered or which were already
covered with common units, i.e., both regions were covered with the 
same unit. Treatment U's subjects were asked to compare and order 
regions which were not covered with common units. Thus, for Treat- 
ment N's subjects there was little reason to focus on the unit when 
comparing, for the number would always suffice. Both treatments were 
to teach toward objectives 7 and 8 through objectives 5 and 6, respect- 
ively. Subjects in both treatments were presented with the same types 
of simple regions and the same type of units. All measures in both 
groups were assigned only by counting; no formulas were introduced. 

The next step in the analysis of the treatments was the instruc- 
tional analysis. In this process the interactions of the learner with 
the mathematical content must be carefully considered. Since there are 
no theories of instruction which lend adequate guidance to this pro-
cess, the investigator resorted to the knowledge gained from the pilot
studies of Treatments A and B and to the expertise of those who have
developed instructional activities for DMP.

Several decisions were reached common to both treatments. First,
both treatments were to reflect the instructional mode of DMP. That 
is, both would center on an activity approach which uses physical ma- 
terials and problem solving whenever feasible. Secondly, the se- 
quencing of activities would be similar. The first three days would 
be spent acquainting the students with the attribute of area, with the 
units used in measuring and with activity learning. The next three 
days would be spent mainly on representing area and the last three days 
on comparing and ordering areas using their measurements or measures. 

Next, each behavioral objective was further analyzed in terms of 
pedagogical considerations. Instructional objectives for each treat- 
ment were written which taught toward the behavioral objectives. Table 
4.2 lists these objectives and keys them to the behavioral objectives 
listed in Table 4.1. 

Step 10b: The final plan for the treatments. From these instruc-
tional objectives nine days of activities were planned for each group.
Each day's session was to be forty minutes long; because flexibility 
was desirable more activities than were considered necessary for each 
forty minutes were planned. Each day's plan included the behavioral 
objectives, the instructional objectives, the materials, the organi- 
zation and a description of the activity. These, as well as the chil- 
dren's activity sheets and a daily journal, are found in Appendix C. 

Table 4.2

INSTRUCTIONAL OBJECTIVES FOR TREATMENTS U AND N

Treatment U                                         Treatment N
Instructional                           Behavioral  Instructional                           Behavioral
Objectives (a)                          Objectives  Objectives                              Objectives

U.1:1   To introduce the attribute of   1           N.1:1   To introduce the attribute of   1
        area.                                               area.

U.1:2   To compare and order regions    2           N.1:2   To compare and order regions    2
        visually on area.                                   visually on area.

U.1:3   To introduce superposition as a 3           N.1:3   To introduce superposition as a 3
        method of comparing and                             method of comparing and
        ordering regions on area.                           ordering regions on area.

U.2:1*  To see that a region consists               N.2:1*  To see that a region may        4
        of parts.                                           consist of congruent parts.

U.2:2   To give experience with the                 N.2:2   To give experience with the     4
        shapes which are to be used as                      shapes which are to be used as
        units of area.                                      units of area.

U.2:3*  To introduce the relationships  5,6
        between the units of area.

U.3:1*  To introduce physically         4,6         N.3:1*  To introduce physically         4,6
        representing area as a method                       representing area with
        of comparing and ordering                           congruent units as a method of
        regions.                                            comparing and ordering regions.

U.3:2*  To consider measurements of the             N.3:2   To practice physically
        area of an object represented                       representing with congruent
        by different units.                                 units the area of objects in
                                                            the room.

U.4:1*  To practice physically          4           N.4:1*  To practice physically
        representing the area of a                          representing the area of a
        region.                                             region with congruent units.

U.4:2*  To introduce symbolically       5,7         N.4:2*  To introduce symbolically       5,7
        representing the area of a                          representing the area of a
        region which has been covered                       region which has been covered
        with either congruent or                            with congruent units.
        non-congruent units.

U.5:1*  To introduce choosing an        4           N.5:1*  To introduce regions which      4
        appropriate unit to represent                       cannot be completely covered
        the area of a region.                               with a given unit.

U.5:2   To practice symbolically        5,7         N.5:2   To practice symbolically        5,7
        representing the area of a                          representing the area of a
        region.                                             region.

U.6:1*  To introduce regions which      4           N.6:1*  To practice assigning           5,7
        cannot be completely covered                        measurements to regions which
        with a given unit or a mixture                      cannot be covered completely
        of units.                                           with a given unit.

U.6:2   To practice symbolically        5,7
        representing the area of a
        region.

U.7:1*  To compare and order regions    6,8         N.7:1*  To compare and order regions    6,8
        whose area measurements are                         whose area measurements are
        expressed in terms of either                        expressed in terms of a common
        common or non-common units.                         unit.

U.8:1*  To practice approximating the   5,7         N.8:1*  To practice comparing and       6,8
        area of a region when it is not                     ordering the areas of two
        covered exactly.                                    regions on area by using 1/2"
                                                            square grids.

U.8:2*  To compare and order regions on 6,8
        area whose measurements have
        been approximated.

U.9:1   To practice comparing and       6,8         N.9:1   To practice comparing and       6,8
        ordering regions on area using                      ordering regions on area using
        their measurements.                                 their measurements.

(a) U.1:1 refers to Treatment U, day 1, instructional objective 1; the
    other entries use similar notation.

*   Indicates that the objective was not common to both treatments.

The purpose of each day's activities and the contrast between treatments
are summarized here.

Day 1: The purposes and the activities for both treatments are
the same. Besides the three instructional objectives (to introduce the 
attribute of area, to compare and order area visually, and to intro- 
duce superposition as a method of comparing and ordering regions) the 
purposes of this lesson are to have the children verbalize as much as 
possible and to have the children begin solving problems on their own. 
For example, after comparing regions visually, the children are pre- 
sented two regions which may be visually deceptive to them. They are 
to find a way to compare the regions or to verify their decision about 
the comparison. 

Day 2: Although one purpose of this lesson is to acquaint the
children with the plastic pieces they are to use for units, it is
expected some would see that a region is made up of many parts and that
the same pieces make many different regions.

In Treatment N the children use congruent pieces to make a larger
region and transfer the region to paper tessellated with that piece.
In Treatment U the children also use non-congruent units to make a
larger region and transfer the region to paper tessellated with one of
the units. For example, they make a design with squares and right
triangles and transfer the design to a paper tessellated with squares. Or
they make a design with rhombuses and transfer it to equilateral paper.

Day 3: Both groups are given the problem of comparing the area
of two objects in the room which cannot be superimposed. Through this
problem the need for physically representing the area by covering is
introduced. In Treatment N the children find the area of objects
in the room by representing them with congruent units. The children 
in Treatment U do the same activity except that two different measure- 
ments of each object are found. The procedure of covering is refined 
in subsequent activities. 

Day 4: The purpose of this activity is to practice representing
regions with discrete units and to assign measurements to the regions. 
In Treatment N the children cover only with congruent units and write 
the number of units required to cover the region. In Treatment U the 
children also cover with non-congruent units and write the measurement 
in terms of all units used or they look at the relationship between 
the units and write the measurement in terms of one unit. Both groups 
are presented regions which may be covered exactly or which are covered 
exactly with a given unit. 

Day 5: Both treatment groups are presented with regions that cannot
be completely covered by a given unit. However, in Treatment U
the children recover with a unit or units that cover exactly. 

Day 6: Again both treatments receive practice in assigning measurements
to regions which cannot be covered exactly and in estimating
areas. Treatment U's students cover the same region with different 
units. In so doing they are led to see that a smaller unit may cover 
more exactly. Treatment N's students cover the same regions but only 
use one unit. 

Day 7: Both treatments compare and order regions on the basis of
their measurements. In Treatment N the coverings are presented or the
children cover both regions with common units. In Treatment U the
coverings may be with non-common units; thus the children must take
into account the relationships between the units to make the comparisons.

Day 8: Again the basic purpose of this day's activities is to compare
and order areas. Treatment N introduces a one-half square inch
grid to assist in making the comparisons. Treatment U emphasizes the
need to approximate areas of regions that cannot be covered exactly.

Day 9: This day is a summary of previous activities in which both
groups receive additional practice in comparing and ordering areas.

To complete the development of the treatments, instruments were
constructed by the investigator to measure the three dependent variables.
These instruments may be found in Appendix D.

The same instrument was constructed for both the achievement
measure (O3) and the retention measure (O5). It consisted of 30 items:
six tested assigning measurements to regions, two tested visual ordering
of regions, four tested ordering regions when only the measurements
were given, and the remaining eighteen tested ordering two regions after
assigning a measurement to each. Out of the 22 items which tested comparing
and ordering with measurements, eight were covered with congruent units.
The remaining 14 required coordination of both the unit and the measure,
or expressing the measurement of the region in terms of one unit.
Regions which were covered exactly and those which were not covered
exactly were used as stimuli.

The transfer measure (O4) consisted of 25 items. Ten of the items
involved area and five involved length. These fifteen items all asked
questions about the unit which were more complicated than those of the
teach-test procedure or the treatments. The remaining ten items were
divided between the attributes of capacity and of numerousness. All
questions dealt directly with the unit of measurement. Thus, the transfer
test was an extremely heterogeneous one.

The Research Strategy 

This section includes the research design used in the study, the 
population, the hypotheses and the proposed statistical analysis. 

Research design. As stated in Chapter II the problem was posed
in terms of an aptitude-treatment interaction question. The basic
research design for an aptitude-treatment interaction study is one in
which the identified aptitude is measured, the subjects are assigned
to a treatment, and the outcome is measured (Cronbach and Snow, 1969,
p. 21). This design was adapted for this study in the following manner:

1. The identified aptitude is measured by a teach-test
procedure which includes a pretest (O1), a
brief instructional unit (t-t), and a posttest (O2).
Three levels of aptitude, Levels I, II, and III, are
determined according to the following specifications:
Level I consists of those students who are at the
nonmastery level on both tests. Level II consists
of those students who have not attained mastery on
the pretest but have attained mastery on the posttest.
Level III consists of those students who are
at the mastery levels on both O1 and O2.






2. Subjects from each of Levels I, II, and III are randomly
assigned either to Instructional Treatment N (N) or to
Instructional Treatment U (U) or they are randomly excluded
from further treatment. This is done in such a
way that each instructional group is assigned m, n, and
p students from Levels I, II, and III, respectively.
Subjects who do not fit either Level I, II, or III, i.e.,
those who attained mastery on O1 and not on O2, are to
be excluded.

3. The outcomes include immediate achievement and transfer
measures from observations O3 and O4, respectively, and
retention measures from O5.

The design is summarized in Figure 4.2.



[Figure: flow of the design through four stages. Teach-Test Procedure
(aptitude measured); Determination of Aptitude Levels (Level I,
Level II, Level III); Assignment to Instructional Treatments;
Instructional Treatments (outcomes measured).]

Figure 4.2. Research Design

Thus, the independent variables are the teach-test treatment
(TT) and the instructional treatments (N and U). The dependent variables
are the observations O3 - O5. Observations O1 and O2 are two
forms of the same instrument which differ only in the order of questions
and labels of stimuli. One instrument is used for O3 which
tests immediate achievement and another instrument is used for O4
which tests transfer. Observation O5 which tests retention is made
with the same instrument used for O3. Both observations O3 and O4
will be made immediately following the treatments and observation O5
will be made approximately four weeks later.
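
The random assignment within aptitude levels described in step 2 of the design can be sketched as follows. This is only an illustration: the function names, the subject identifiers and the group sizes are hypothetical, and the text does not say how the randomization was actually carried out.

```python
import random


def assign_within_level(subject_ids, group_size, rng):
    """Randomly split one aptitude level into N, U and excluded subjects."""
    if 2 * group_size > len(subject_ids):
        raise ValueError("not enough subjects in this level for two groups")
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {
        "N": shuffled[:group_size],
        "U": shuffled[group_size:2 * group_size],
        "excluded": shuffled[2 * group_size:],
    }


def assign_all_levels(levels, group_sizes, seed=0):
    """Apply the same rule to every level so each treatment gets m, n and p subjects."""
    rng = random.Random(seed)
    return {
        level: assign_within_level(ids, group_sizes[level], rng)
        for level, ids in levels.items()
    }


# Hypothetical subject identifiers and group sizes (m, n, p), for illustration only.
levels = {
    "I": list(range(100, 112)),
    "II": list(range(200, 210)),
    "III": list(range(300, 304)),
}
group_sizes = {"I": 5, "II": 4, "III": 2}
assignment = assign_all_levels(levels, group_sizes, seed=1971)
```

Shuffling once per level and slicing keeps the two treatment groups the same size within each level, which is the property the design asks for.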

Population. From the pilot study conducted in the spring of
1971 it was concluded that the second or third grade was the proper
placement for the proposed experiment. In order that all levels of
the defined aptitude be represented in the instructional treatments,
it is recommended that the teach-test treatment include three times
as many subjects as each of the instructional treatments. Thus, if
each instructional treatment has 30 subjects, 90 subjects would be
needed for the experiment.

Hypotheses. For each dependent variable, achievement, transfer
and retention, three research hypotheses were posed. In each case one
of these three was of primary concern in this study and is designated
the primary research hypothesis while the other two were of a
secondary nature. A primary research hypothesis refers to the
interaction of the aptitude with the treatment while a secondary
research hypothesis refers to the main effect of either aptitude or
treatment. Thus, the following nine research hypotheses were posed:

Primary research hypothesis 1a. There is a significant
interaction between the ability to learn about a unit of
length as determined by the teach-test procedure and the
two treatments when measured by achievement.

Secondary research hypothesis 1b. There is a significant
difference between the levels of aptitude on their achievement
performance.

Secondary research hypothesis 1c. There is a significant
difference between the treatment groups on their achievement
performance.

Primary research hypothesis 2a. There is a significant
interaction between the ability to learn about a unit of
length as determined by the teach-test procedure and the
two treatments when measured by transfer.

Secondary research hypothesis 2b. There is a significant
difference between the levels of aptitude on their transfer
performance.

Secondary research hypothesis 2c. There is a significant
difference between the treatment groups on their transfer
performance.

Primary research hypothesis 3a. There is a significant
interaction between the ability to learn about a unit
of length as determined by the teach-test procedure and
the two treatments when measured by retention.

Secondary research hypothesis 3b. There is a significant
difference between the levels of aptitude on their
retention performance.

Secondary research hypothesis 3c. There is a significant
difference between the treatment groups on their retention
performance.

The arbitrary decision was made to reject the corresponding
null hypotheses at the .05 level of significance.

Proposed Statistical Analysis. Although there exist more sophisticated
techniques for testing the presence of an aptitude-treatment
interaction (Cronbach and Snow, 1969, p. 23), there are two common
techniques: analysis of variance and regression analysis. In this
study both of these common techniques would produce the same F-ratios
(Kelly, Beggs and McNeil, 1969). This study does not lend itself to
more sophisticated statistical techniques, so a 3 x 2 ANOVA is
proposed (three levels of aptitude by two treatments).
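
As a rough illustration of what a 3 x 2 analysis of variance computes, the sketch below derives the three F-ratios (aptitude, treatment, and aptitude-by-treatment interaction) for a design with equal cell sizes. The function name and the toy scores are hypothetical and are not data from this study; the toy scores are built so that the cell means are purely additive, which makes the interaction F-ratio zero.

```python
from statistics import mean


def two_way_anova(cells):
    """F-ratios for a balanced two-way design.

    cells maps (aptitude_level, treatment) to a list of scores; every
    combination must be present with the same number of scores.
    """
    levels = sorted({a for a, _ in cells})
    treatments = sorted({t for _, t in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(scores) != n for scores in cells.values()):
        raise ValueError("this sketch assumes equal cell sizes of at least 2")

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    level_mean = {a: mean(cell_mean[(a, t)] for t in treatments) for a in levels}
    treat_mean = {t: mean(cell_mean[(a, t)] for a in levels) for t in treatments}

    # Sums of squares for the two main effects, the interaction and error.
    ss_level = n * len(treatments) * sum((level_mean[a] - grand) ** 2 for a in levels)
    ss_treat = n * len(levels) * sum((treat_mean[t] - grand) ** 2 for t in treatments)
    ss_inter = n * sum(
        (cell_mean[(a, t)] - level_mean[a] - treat_mean[t] + grand) ** 2
        for a in levels
        for t in treatments
    )
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_level = len(levels) - 1
    df_treat = len(treatments) - 1
    df_inter = df_level * df_treat
    df_within = len(levels) * len(treatments) * (n - 1)
    ms_within = ss_within / df_within

    return {
        "aptitude": (ss_level / df_level) / ms_within,
        "treatment": (ss_treat / df_treat) / ms_within,
        "interaction": (ss_inter / df_inter) / ms_within,
    }


# Toy scores with additive cell means (illustration only, not study data).
toy_cells = {
    (level, treatment): [10.0 + 2 * i + 3 * j + d for d in (-1.0, 0.0, 1.0)]
    for i, level in enumerate(("I", "II", "III"))
    for j, treatment in enumerate(("N", "U"))
}
f_ratios = two_way_anova(toy_cells)
```

The primary research hypotheses correspond to the "interaction" entry, and the secondary hypotheses to the "aptitude" and "treatment" entries.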

There are several assumptions which must be met in order to
justify using an analysis of variance (Hays, 1965, p. 396). First,
the errors must be normally distributed for each of the aptitude-treatment
populations. This assumption may be violated provided the
number of subjects is relatively large. This and the random assignment
to treatment groups should help assure that the F-test would be
unaffected by non-normality. Second, the errors must have the same
variance for each aptitude-treatment group. This assumption may be
violated without serious consequences if the number of cases in each
sample is the same. Thus, it is planned to have all cells equal or as
near to equal as possible. Third, and the most important assumption,
is that of independence of observations both across and within
treatment groups. The interaction across treatment groups is believed
to be held to a minimum, but the use of individual scores as a basis
of analysis does not meet the assumption of independence within treatment
groups. The investigator was well aware of the violation of this
assumption. However, it was impossible to secure a large enough sample
to use the class as a unit. Furthermore, the investigator did not use
groups smaller than class size as instructional units because of the
interest in the reaction of class size groups to the treatments.
Thus, the individual was used as the unit of analysis and the results
will be interpreted in light of this.

Chapter V
EXECUTION OF THE STUDY



Introduction

The preceding chapter described the development of the study. 
This chapter reports the events which occurred from the time of the 
selection of the population through the time of the collection of the 
data. A description of the population, summaries of the teach-test
procedure and the instructional treatments, the criteria and procedure
for determining the aptitude levels, and the results from the observations
are included.



Population 

The study was conducted at Hartland Elementary School, Hartland,
Wisconsin. Hartland has a population of about 3000 and is located in
Waukesha County approximately 30 miles west of Milwaukee. Although
primarily a rural community, many of its citizens are employed in
Milwaukee or other nearby towns.

Hartland Elementary School has 554 students enrolled in kindergarten
through eighth grade. It has its own school board which functions
independently of the other elementary schools in the county.
The primary classes are in a new part of the building; the rooms are
pleasant and well equipped.

Because of the size of the school all of the second and third
grade students were used. This provided 110 subjects, 44 from the
second grade and 66 from the third grade. Both grades were using
Addison Wesley's Elementary School Mathematics, 1968, for mathematics.
The second grade was also using AAAS's Science: A Process Approach,
1968, for the first time.



Prestudy Arrangements

Selection of teacher. Miss Marcia Dana, a certified elementary
school teacher experienced in teaching primary grades, was selected to
teach the instructional unit of the teach-test procedure and both treatments.
At the time of the experiment she was a Project Specialist at
the Wisconsin Research and Development Center. This position involved
writing activities for DMP Levels I-IV (kindergarten - second grade)
and working with teachers in developmental schools. Thus, she was
familiar with DMP's approach and philosophy.

In developing the activities for the treatments and for the instructional
unit for the teach-test procedure, Miss Dana gave valuable
assistance to the investigator in matters which concerned classroom
management or selection of suitable activities. Hence, other than a
daily review of the purposes and procedures for the next day, it was
unnecessary to give her any special training.

Initial contact with the school. The principal of Hartland Elementary
School was contacted about the possibility of conducting the
study in his school. When the type of mathematics program which DMP
purports and the type of study were explained, he readily consented.
The proposed dates, November 22, 1971 through December 10, 1971 and one
day in January, 1972, were acceptable.

Meeting with the teachers involved. Approximately one week before
the beginning of the study, both the investigator and the experimental
teacher met with the six second and third grade teachers and an assistant
administrator of the system. At this time the study was explained
to the teachers and an invitation to attend any of the classes was
extended. Also, the investigator and the experimental teacher became
familiar with the school and its practices. The teachers appeared to
be most cooperative, an observation which was confirmed throughout the
study.

At this meeting the necessary physical arrangements and schedules
were made. It was decided that the teach-test procedure would be
conducted in the individual classrooms for the three days, Monday
through Wednesday, before the Thanksgiving holidays. A schedule of
times for each class was arranged which caused the least possible
interruption to the regular schedules of the teachers. All the posttesting
was scheduled for Wednesday morning.

Two teachers volunteered their classrooms for the instructional
treatments. Both rooms were approximately the same size and had about
the same educational climate. However, one was a second grade classroom
and the other was a third grade classroom. Because it was inconvenient
for the teachers to change the treatment groups daily, it was
decided to change only at the end of the first week. Two forty-minute
periods, one from 9:00 to 9:40 and the other from 10:35 to 11:15, were
scheduled. For the first week Treatment U was to be held from 9:00 to
9:40 and Treatment N from 10:35 to 11:15. During the second week
Treatment N was to be held first and Treatment U second. A summary
of these arrangements is shown in Figure 5.1.







                Room 1           Room 2
                9:00 - 9:40      10:35 - 11:15

1st Week        U                N
2nd Week        N                U

Figure 5.1. Physical and Temporal Arrangement for Treatments

The teachers were told that on the Monday morning after Thanks- 
giving they would be notified about which students were selected for 
each of the treatments. By working with the teachers at their grade 
level they agreed to make the necessary arrangement for the students 
who were not involved during the two periods. This involved combining
classes when one of their level's rooms was in use and planning lessons
that would not be crucial if missed by those participating in the
experiment.

Summary of the Teach-test Procedure

The plan and purposes of the teach-test procedure were explained
in Chapter IV. The lesson plans, activity sheets and journal may be
found in Appendix A and the observations O1 and O2 may be found in
Appendix B. A summary of the three days is given here.

Monday, November 22. In each of the six classes the first twenty
minutes were spent on the pretest, Observation O1, and the last twenty
minutes were spent on Lesson 1. There were 101 students present for
these sessions. The day went as planned; there was no difficulty in
administering the test and the children were responsive to the lesson.

Tuesday, November 23. Each class spent 40 minutes on Lesson 2 of 
the teach-test procedure. Both parts of the lesson kept the children's 
attention. It was evident from some individuals' responses that they 
were still centering on number and ignoring the unit. Likewise, there
appeared to be many who were using the relationships between the given 
units in making comparisons. 

Wednesday, November 24. Although 98 students were present for the
posttest, Observation O2, not all of these had been present for the
previous two days. Complete data for 90 students were collected.
Twenty minutes were allotted in each class for the test. The schedule 
was arranged so that all the tests could be administered in the morning. 
It was interesting to note that the second graders moved through the 
test more rapidly than the third graders did. 

Results of O1 and O2

Observations O1 and O2 were used to determine the aptitude levels.
Before discussing this determination the results of these two observations
are given and discussed here. These results are given for the
ninety students who were present for all three days of the teach-test
procedure. Their scores on O1 and O2 along with their grade, age and
sex are reported in Table E.1 of Appendix E. The first two items on
both forms O1 and O2 were used to test prerequisite behaviors and as
warm-up items. Only the last twenty items, the items testing the objectives
of the teach-test procedure, are considered in this analysis
as well as in determining the levels of aptitude.

The means, standard deviations, Hoyt reliability coefficients and
standard errors for O1 and O2 are shown in Table 5.1.

Table 5.1

DESCRIPTIVE STATISTICS FOR O1 AND O2

Observation       Mean      Standard     Hoyt          Standard
                            Deviation    Reliability   Error

O1 (pretest)      9.244     2.079        .340          1.646
O2 (posttest)     12.544    3.336        .703          1.772

As indicated in Table 5.1 the difference between the means of O1
and O2 was 3.3. A t-test was used to test the significance of this
difference. A t value of 7.921 showed that this difference was significant
at the p < .001 level.

The Hoyt reliability coefficient was determined by using an internal
criterion, the score on the observation. The item analysis
for O1 and O2 may be found in Tables F.1 and F.2, respectively, of
Appendix F. An examination of the item analysis for O1 revealed that
there were eight items with extremely low r biserials. The same eight
items had slightly higher r biserials on O2 but were the lowest on
this observation also.
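
The text does not give the formula used for the Hoyt coefficient. The sketch below assumes the usual analysis-of-variance form, in which the coefficient is one minus the ratio of the residual mean square to the between-students mean square in a students-by-items table. The function name and the small response matrix are hypothetical and serve only to show the computation.

```python
def hoyt_reliability(responses):
    """Hoyt reliability from a students-by-items matrix of item scores.

    responses is a list of rows, one per student, each a list of item
    scores (for example 1 for correct and 0 for incorrect).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    grand = sum(sum(row) for row in responses) / (n_students * n_items)

    student_means = [sum(row) / n_items for row in responses]
    item_means = [
        sum(row[j] for row in responses) / n_students for j in range(n_items)
    ]

    # Partition the total sum of squares into students, items and residual.
    ss_total = sum((x - grand) ** 2 for row in responses for x in row)
    ss_students = n_items * sum((m - grand) ** 2 for m in student_means)
    ss_items = n_students * sum((m - grand) ** 2 for m in item_means)
    ss_residual = ss_total - ss_students - ss_items

    ms_students = ss_students / (n_students - 1)
    ms_residual = ss_residual / ((n_students - 1) * (n_items - 1))
    return 1 - ms_residual / ms_students


# Hypothetical responses of four students to three items (illustration only).
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
toy_coefficient = hoyt_reliability(toy_responses)
```

Items that every low scorer answers correctly add little to the between-students mean square, which is one way a block of such items can pull the coefficient down.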

These eight items were the only items which a child would have 
answered correctly if he was always making the comparison based on 
number alone. These items also had low levels of difficulty. Thus, 
it appears that these items could account for the low reliabilities.
(A separate analysis of these items and of the remaining twelve is
reported and discussed in Chapter VII.) While there is danger in using
unreliable instruments, in this case the investigator did not feel
that this danger was crucial for two reasons. First, the low reliability
was probably due to the eight items which measured a different
ability from the other items. Second, the determination of the aptitude
levels depended mainly on the score on O2 whose reliability was much
higher. This second point is discussed in more detail after the aptitude
level determination is explained.

Determination of Aptitude Levels 

The initial planning called for the three levels of aptitude to 
be determined by specifying a mastery level. Different combinations 
of mastery and non-mastery on the pretest and posttest were designated
as the levels. It was evident from the pilots of the teach-test
procedure and from the teach-test portion of this study that there
were three levels of achievement on each test. There was the mastery
group, but there also was a group who had attained many of the behaviors
without reaching mastery and a group who definitely was non-mastery.
This middle group, the group which had not quite reached mastery, is
what DMP classifies as "progressing toward mastery". It was decided
that students in this category on the posttest or on the pretest who
fell to non-mastery on the posttest would be dropped from the instructional
treatments. Thus, the design of the study becomes similar to
what Cronbach and Snow (1969, p. 21) call an extreme group design.
In an extreme group design the middle group is eliminated. This
decision assured that there would be more difference between
Levels I and II.

Furthermore, it was evident that the behavior of some students
was changing from the pretest to the posttest. Since Level I was to
consist of those students who were at a low level of attainment and
who were not affected by the teach-test treatment, it was decided to
add the restriction that those in Level I should have evidenced no
real change. Likewise, since Level II consisted of those who changed
during the teach-test treatment, the restriction that these subjects
must have evidenced change was added.

Thus, in the final determination of the aptitude levels two criteria
were used: first, the level of attainment of the specified behaviors,
and second, the amount of change evidenced from pretest to posttest.
A mastery level of 75% was set. Any student in the 55-75% range was
classified as progressing toward mastery and any student at the 55%
or below level was classified as a non-mastery student. A change of
more than 15% was necessary to indicate a change due to instruction.
Using these criteria the levels as defined are specified as follows:

Level III consists of those students who had attained mastery on
both the pretest and the posttest. Thus, their score on both O1 and O2
must be 75% or better (15 points or better).

Level II consists of those students who had not attained mastery
on the pretest, but had attained mastery on the posttest and had
evidenced a change of more than 15%. Thus, their score on O1 must be
less than 75% (less than 15 points) and must be 75% or better (15 points
or better) on the posttest O2 and their score must have changed more
than 15% (a change of more than 3 points).

Level I consists of those students who were at the non-mastery
level on both tests and who evidenced no major change. Thus, their
scores on both O1 and O2 must be 55% or less (11 points or less) and
their score must have changed no more than 10% (2 points).
These levels are summarized in Figure 5.2. 



                                                Change

Level III     s(O1) ≥ 15      s(O2) ≥ 15       (no restriction)

Level II      s(O1) < 15      s(O2) ≥ 15       s(O2) - s(O1) > 3

Level I       s(O1) ≤ 11      s(O2) ≤ 11       |s(O2) - s(O1)| ≤ 2

[Note: s(O1) means score in points on O1, etc.]

Figure 5.2. Criteria for Aptitude Levels

These specifications eliminated any student who

(a) was at mastery on the pretest, but not on the posttest,

(b) was progressing toward mastery on the pretest, attained
mastery on the posttest, but did not evidence enough
change,

(c) was progressing toward mastery on both tests,

(d) was progressing toward mastery on the pretest, but was
nonmastery on the posttest,

(e) was nonmastery on the pretest and only progressing toward
mastery on the posttest, or

(f) was nonmastery on both tests, but evidenced too much change.

Figure 5.3 summarizes all the logical combinations of mastery (m),
progressing toward mastery (p), and nonmastery (n) with the added
restrictions of change (c). It also shows how each combination was
classified for this study and the number of subjects in each
classification.

As Figure 5.3 shows these classifications produced 2 students in 
Level III, 27 in Level II and 32 in Level I. Table E.2 in Appendix E 
reports the s^.ores on 0^ and 0^ for those three levels. Twenty-nine 
students were eliminated for one of the six reasons (a-f) listed in 
the previous paragraph. Only one subject was eliminated for each of 
the reasons, c ard d; two were eliminated for reasons a •! d; 
fourteen for reason e and nine for reason f. Table r.l in Appendix k 
gives their scores, as well as the reasons why they were el Imi naie-l . 
O1 level      O2 level      Change      Classification    Number of 
of mastery    of mastery                                  Students 

    m             m                       Level III           2 
    m            ~m                       (a)                 2 
   ~m             m         c > 3         Level II           27 
   ~m             m         c < 3         (b)                 2 
    p             p                       (c)                 1 
    p             n                       (d)                 1 
    n             p                       (e)                14 
    n             n         |c| < 3       Level I            32 
    n             n         |c| > 3       (f)                 9 

[Note: ~m means p or n.] 

Figure 5.3. Possible Combinations of Mastery Levels on O1 and O2 
and Classifications of These Combinations 
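
The classification rules summarized in Figure 5.3 can be sketched as a small function. This is only an illustrative reading of the criteria, not the procedure actually used in the study: the function names are hypothetical, the cut-offs (15 or more for mastery, 11 or less for nonmastery) are taken from the text, and the handling of a change of exactly 3 points follows the prose ("at least four points" for Level II, "no more than 2 points" for Level I), which is an assumption where the figure is ambiguous.

```python
def mastery_level(score):
    """Return 'm' (mastery), 'p' (progressing) or 'n' (nonmastery)."""
    if score >= 15:
        return "m"
    if score <= 11:
        return "n"
    return "p"


def classify(o1, o2):
    """Classify a student from pretest (o1) and posttest (o2) scores.

    Returns 'Level I', 'Level II', 'Level III', or the letter of the
    elimination reason '(a)' through '(f)'.
    """
    pre, post = mastery_level(o1), mastery_level(o2)
    change = o2 - o1
    if pre == "m":
        return "Level III" if post == "m" else "(a)"
    if post == "m":
        # Assumed boundary: a gain of at least four points is required.
        return "Level II" if change >= 4 else "(b)"
    if pre == "p":
        return "(c)" if post == "p" else "(d)"
    if post == "p":
        return "(e)"
    # Nonmastery on both tests; assumed boundary: change of at most 2 points.
    return "Level I" if abs(change) <= 2 else "(f)"


if __name__ == "__main__":
    for pair in [(9, 16), (12, 15), (8, 9), (5, 10)]:
        print(pair, classify(*pair))
```

Every pair of scores falls into exactly one of the nine rows of Figure 5.3, which is why the function needs no fall-through case.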



These classifications were arbitrary with respect to the decisions 
for mastery level and amount of change. Once these decisions were 
made, the classification of an individual was determined. Due to 
errors of measurement two types of errors were possible. First, 
subjects may have been eliminated who should have been retained and 
second, subjects may have been retained who should have been 
eliminated. Because the remainder of the study depended upon those who 
were retained, the second type of error was more crucial than the 
first. Although there is no assurance that this type of error was 
not made, the method of selection should have held it to a minimum. 
The first criterion of selection depended on Observation O2, the more 
reliable instrument. After a subject met this criterion he also had 
to meet the criteria for change. Considering the standard error for 
O1 and O2 (see Table 5.1) there is no evidence to support the 
statement that a subject in Level II who had to gain at least four points 
did so because of error in measurement. On the other hand the 
subjects in Level I were permitted a change of two points, a change that 
may have been due solely to error in measurement. Furthermore and 
more importantly, it is essential that those retained were in the 
correct levels. Since a score of 15 or more on O2 was required for 
Level II subjects and a score of 11 or less on O2 was required for 
Level I subjects, there is no reason to believe that a subject 
classified in one level should have been in the other. 

Because there were only two students in Level III they were 
dropped from the instructional treatments. Thus, only two levels 
were retained. The means and standard deviations on O1 and O2 for 
these two levels are reported in Table 5.2. 

Looking at Table 5.2 one observes that there is not much 
difference between the means of Level I and Level II on O1. A t-test 
(t = 1.993) showed that the difference between these means was not 
significant at the p < .01 level. However, the difference between 
the means of Level I and Level II on O2 was significant at the p < .01 
level (t = 24.754). It appears that these two groups were not 
different in the abilities tested before the teach-test treatment, but 
differed significantly in these abilities afterwards. 





Table 5.2 

DESCRIPTIVE STATISTICS FOR LEVELS I AND II 

                        Observation O1          Observation O2 
            Number      Mean     Standard       Mean     Standard 
                                 Deviation               Deviation 

Level I       32        8.594      .701         9.433     1.059 
Level II      27        9.222     1.474        16.556     1.010 



Treatments' Population 

The students in Level I and Level II were randomly assigned to 
Treatments N and U. Treatment N had 16 Level I students and 13 Level 
II students and Treatment U had 16 Level I students and 14 Level II 
students. Although no hypotheses were proposed concerning grade or 
sex, the groups are further described here according to sex and grade. 
There were 9 boys and 20 girls in Treatment N and 11 boys and 19 girls 
in Treatment U. It is interesting to note that although there were 
fewer boys in the treatments there were more boys than girls in Level 
II. There were 9 second graders and 20 third graders in Treatment N 
and 10 second graders and 20 third graders in Treatment U. Most 
second graders were in Level I; however, as many third graders as 
second graders were in this level. Figure 5.4 is a complete description 
of the sample. 



[Figure 5.4: counts of students in Treatment N and Treatment U, broken 
down by level, grade (2, 3) and sex (M, F).] 

Figure 5.4. Description of Treatments' Sample According to Level, 
Age, Sex and Grade 



Table E.5 in Appendix E contains the scores on O1 and O2 by 
treatment groups and levels. 

Summary of the Treatments 

The planned treatments were described in Chapter IV. A brief 
summary of the nine days of treatments is given here. The reader is 
referred to Appendix C for the lesson plans, the activity sheets and 
the daily journal for each treatment. 

During the first week the subjects in Treatment U met from 9:00 
to 9:40 and the subjects in Treatment N met from 10:35 to 11:15. For 
the second week the groups interchanged both times and rooms. The 
investigator felt that the time made little difference in the group's 
reactions, but that the third grade classroom was better equipped for 
thirty students. 

There were few absentees; most of these were not for more than 
two days. Upon their return either the experimental teacher, the 
investigator or another student helped the absentee with the 
essentials of what he had missed. No one was dropped from the study 
because of absences during the treatment period. 

The investigator was present in the classroom each day to 
observe the students and to make certain that the objectives of each 
lesson were adequately covered. If it was felt that an objective 
needed additional attention the remaining lessons were modified 
accordingly. This occurred only a few times and the adjustments made 
were minor. At times the type of activity was changed to provide a 
change of pace. For example, a game from the last activity was played 
on the seventh day by some of the students in Treatment U. The daily 
journal records such changes. It should be noted that more activities 
usually were planned for each day than the investigator felt were 
necessary. Thus, the last part of many lessons was not done on the 
day specified or at a later time. 

There were no major interruptions. The teachers involved and the 
principal and his staff remained cooperative throughout the two weeks. 
The teachers often observed the classes when they did not have other 
responsibilities. 

The children adjusted quickly to the new routine; they remained 
enthusiastic but reacted naturally to the whole experiment. In 
particular, they enjoyed working at stations which involved moving around 
the room, contests which involved estimating and the variety of 
materials and pictures. There were no apparent differences between 
the students in the two treatment groups. If one day the investigator 
felt the lesson was more successful in Treatment N, then it was 
probable that the next day it would be just the opposite. 

There were usually fewer essential points to be covered in 
Treatment N than in Treatment U so that class appeared more relaxed. 
However, there was always plenty to be done; the children were content 
and enjoyed it, but it was often not challenging enough for the better 
students. In Treatment U the better students were challenged, a 
challenge that was often over the head of a slower student. But the slower 
student was not frustrated because he did not even realize the 
challenge and could arrive at an answer which satisfied him. 

The investigator was satisfied that both groups were presented 
with the same type of activities and the same amount of manipulative 
work. The questions asked and the discussions, not the instructional 
mode, accounted for the difference between the treatments. 



Results of Observations O3 - O5 

Three dependent measures were taken: an immediate achievement 
measure (Observation O3), a transfer measure (Observation O4) and a 
retention measure (Observation O5). These instruments may be found in 
Appendix D. Both O3 and O4 were administered on the tenth day, 
December 10, 1971. Observation O3 took approximately 40 minutes; 
after an hour break Observation O4 was given. There was little 
difficulty in administering O3; the directions were clear and the 
students understood what was being asked. Observation O4 was more 
difficult to administer due to the unfamiliarity of the content, to the 
more complicated directions and to the demand for a second period of 
concentration within one day. Three students were absent for O3 and 
O4. Observation O5 was administered on January 12, 1972. It was the 
same instrument that had been used for O3 and administering it 
presented no difficulty. The three students who were absent for O3 and 
O4 were present, but two others were absent. All five of these 
students were dropped from the final analysis of variance. 

The raw data for these observations are given in Table E.5 in 
Appendix E. Because O5 is made from the same instrument as O3, the 
scores for O5 follow those for O3. Hence, for each student the 
achievement, retention and, then, transfer scores are given. 

The descriptive statistics for O3 - O5 are given in Table 5.3. 



Table 5.3 

DESCRIPTIVE STATISTICS FOR O3 - O5 

Observation         Number      Mean      Standard     Hoyt          Standard 
                    of Items              Deviation    Reliability   Error 

O3 (achievement)      30       16.5179     6.8357        .8948        2.1797 
O4 (transfer)         25        6.8214     2.9487        .5432        1.9526 
O5 (retention)        30       17.7368     7.5298        .9215        2.0736 
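
The thesis does not state how the standard errors in Table 5.3 were obtained. As a hedged check, the sketch below assumes the standard error of measurement is SD · sqrt((1 − r)(k − 1)/k), where r is the Hoyt reliability and k the number of items. That formula is an inference from the tabled numbers, not something the source confirms, and the function name is hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability, n_items):
    """Assumed formula: SD * sqrt((1 - r) * (k - 1) / k)."""
    return sd * math.sqrt((1.0 - reliability) * (n_items - 1) / n_items)


# (number of items, standard deviation, Hoyt reliability, reported SE)
# Values copied from Table 5.3.
TABLE_5_3 = {
    "O3 (achievement)": (30, 6.8357, 0.8948, 2.1797),
    "O4 (transfer)": (25, 2.9487, 0.5432, 1.9526),
    "O5 (retention)": (30, 7.5298, 0.9215, 2.0736),
}

if __name__ == "__main__":
    for name, (k, sd, r, reported) in TABLE_5_3.items():
        computed = standard_error_of_measurement(sd, r, k)
        print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Under this assumption the computed values agree with the reported ones to within about 0.001, which suggests (but does not prove) that the program used reported an error term of this form.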
The mean score on O5 was slightly higher than on O3. This 
difference, as will be seen in Chapter VI, was due to the increase in 
Treatment U's scores from O3 to O5. The Hoyt reliability coefficients 
for O3 and for O5 were high. The mean score for O4 was extremely 
low. The transfer test (O4) proved to be unexpectedly difficult both 
from the standpoint of individuals and individual items. Only 3 
subjects answered correctly over half of the items and the highest 
score was 17. Likewise, only three problems were correctly answered 
by more than half the subjects. The reliability of O4 was low. Thus, 
because of the low mean and reliability of O4, any interpretation of 
further statistical analysis arising from this observation must be 
made cautiously. The item analyses for observations O3 - O5 may be 
found in Tables F.3 - F.5, respectively, in Appendix F. 

The statistical tests of the hypotheses of this study which 
involve the three observations O3 - O5 are reported in Chapter VI. 
Chapter VII contains the conclusions reached based on the statistical 
analyses. 






Chapter VI 
DATA ANALYSIS 
Introduction 

This chapter presents the data analyses used for testing the 
hypotheses of this study and an analysis of retention ratios. The 
research hypotheses which were stated in Chapter IV are summarized here. 
Each primary hypothesis stated that there is a significant interaction 
between aptitude and treatment. There were three such hypotheses, one 
for each dependent measure. The secondary hypotheses each stated that 
there is a significant main effect. There were six secondary hypotheses: 
a main effect due to aptitude hypothesis and a main effect due to 
treatment hypothesis for each of the three dependent measures. 

Each of these hypotheses is discussed within the section dealing 
with the appropriate dependent measure. In each case it was planned 
to use a 3 X 2 ANOVA but since only two levels of aptitude were found 
each null hypothesis was tested by a 2 X 2 ANOVA. A significance level 
of .05 was set. After the discussion of the hypotheses an examination 
of the retention ratios is reported. 

Achievement Measure 

One dependent variable was achievement; this was measured by 
observation O3. The instrument used for O3 consisted of 30 items which 
tested the objectives of the treatments and was described in Chapter V. The 
raw data for O3 may be found in Table E.5 in Appendix E. The maximum 
possible score on O3 is 30; the scores for this population ranged from 
5 to 28 with a mean of 16.5. The descriptive statistics for O3 related 
to aptitude levels and treatments are reported in Table 6.1. 



Table 6.1 

DESCRIPTIVE STATISTICS FOR O3: APTITUDE, TREATMENT, 
AND APTITUDE BY TREATMENT GROUPS 

Group              Number of Subjects      Mean      Standard Deviation 

Aptitude 
   I                       29             14.069          6.216 
   II                      25             19.320          6.517 

Treatment 
   U                       28             20.786          5.412 
   N                       26             11.884          4.950 

Aptitude by 
Treatment 
   I  U                    16             18.125          5.328 
   I  N                    13              9.077          2.397 
   II U                    12             24.333          3.025 
   II N                    13             14.692          5.313 
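
The marginal means in Table 6.1 are the weighted averages of the corresponding cell means. The following minimal sketch recomputes them from the four aptitude-by-treatment cells; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Cell sizes and means for O3, copied from Table 6.1.
# Keys are (aptitude level, treatment).
CELLS_O3 = {
    ("I", "U"): (16, 18.125),
    ("I", "N"): (13, 9.077),
    ("II", "U"): (12, 24.333),
    ("II", "N"): (13, 14.692),
}


def marginal_mean(cells, factor_index, level):
    """Weighted mean over all cells whose key matches `level`.

    factor_index 0 selects by aptitude level, 1 selects by treatment.
    """
    selected = [(n, mean) for key, (n, mean) in cells.items()
                if key[factor_index] == level]
    total_n = sum(n for n, _ in selected)
    return sum(n * mean for n, mean in selected) / total_n


if __name__ == "__main__":
    for level in ("I", "II"):
        print("Aptitude", level, round(marginal_mean(CELLS_O3, 0, level), 3))
    for treatment in ("U", "N"):
        print("Treatment", treatment,
              round(marginal_mean(CELLS_O3, 1, treatment), 3))
```

Because the cell sizes are unequal (16, 13, 12, 13), the weighted and unweighted marginal means differ slightly; the tabled values correspond to the weighted ones.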
Figures 6.1 and 6.2 give a graphic picture of the mean scores of 
achievement for the aptitude by treatment groups. Figure 6.1 shows 
that Level II subjects' scores were higher than Level I subjects' scores 
in each of the treatment groups. In Figure 6.2 it appears that 
Treatment U was more effective than Treatment N for both levels of aptitude. 
However, note that in each graph the differences between the 
corresponding ordinates are approximately equal. It appears that there is little 
chance of any interaction. 



Figure 6.1. Aptitude Effect on Achievement 

Figure 6.2. Treatment Effect on Achievement 



These observations were examined by testing the following null 
hypotheses: 

H.1a: The difference between the mean score on achievement 
of the Treatment U Level I group and the mean score on 
achievement of the Treatment U Level II group is equal 
to the difference between the mean score on achievement 
of the Treatment N Level I group and the mean 
score on achievement of the Treatment N Level II 
group. (In other words, there exists no interaction 
between the aptitude level and the treatment as 
measured by achievement.) 

H.1b: The mean score on achievement for Level I equals the 
mean score on achievement for Level II. 

H.1c: The mean score on achievement for Treatment U equals 
the mean score on achievement for Treatment N. 
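
The interaction hypothesis compares two differences of cell means. As an illustrative sketch (the function name is hypothetical, and the significance test itself is the ANOVA, not this calculation), the observed "difference of differences" for achievement can be computed from the cell means in Table 6.1:

```python
# Cell means for O3 (achievement), copied from Table 6.1.
# Keys are (aptitude level, treatment).
MEANS_O3 = {
    ("I", "U"): 18.125,
    ("I", "N"): 9.077,
    ("II", "U"): 24.333,
    ("II", "N"): 14.692,
}


def interaction_contrast(means):
    """Level II minus Level I gap in Treatment U, minus the same gap in N.

    A value near zero is what the null hypothesis of no
    aptitude-by-treatment interaction predicts for the population.
    """
    gap_u = means[("II", "U")] - means[("I", "U")]
    gap_n = means[("II", "N")] - means[("I", "N")]
    return gap_u - gap_n


if __name__ == "__main__":
    print(round(interaction_contrast(MEANS_O3), 3))
```

The two gaps are about 6.2 and 5.6 points, so the contrast is well under one raw score point on a 30-item test, in line with the nearly parallel lines described for Figures 6.1 and 6.2.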

The first hypothesis H.1a corresponds to the primary research 
hypothesis for achievement and the other two correspond to the two 
secondary hypotheses for achievement. Statistics relevant to testing 
each of these null hypotheses are presented in Table 6.2. 

Table 6.2 

ANALYSIS OF VARIANCE FOR O3 (ACHIEVEMENT) 

Source                    df         MS           F          p < 

Aptitude X Treatment       1         1.732        .063      .8032 
Aptitude                   1       468.086      25.055      .0001 
Treatment                  1      1166.020      62.414      .0001 
Error                     50        18.682 







As can be seen from Table 6.2, hypothesis H.1a cannot be rejected. 
Statistically, the interaction as measured by achievement between 
aptitude and treatment was not significant. Both hypotheses H.1b and 
H.1c are rejected. There is statistical evidence that both the 
aptitude effect and the treatment effect were significant. Thus, the 
analysis of variance supports the observations made from Figures 6.1 
and 6.2. 



Transfer Measure 

The second dependent variable was transfer; this was measured by 
observation O4. The instrument used for O4 was discussed in Chapter 
V. The raw data for O4 may be found in Table E.5 in Appendix E. Out 
of a possible score of 25 the highest score was 17 and the lowest score 
was 0. From the mean of 6.8 one sees that the test was extremely 
difficult for this population. The descriptive statistics for O4 related 
to levels and treatments are reported in Table 6.3. 



Table 6.3 

DESCRIPTIVE STATISTICS FOR O4: APTITUDE, TREATMENT 
AND APTITUDE BY TREATMENT GROUPS 

Group              Number of Students      Mean      Standard Deviation 

Aptitude 
   I                       29              5.690          2.054 
   II                      25              8.000          3.277 

Treatment 
   U                       28              7.25           7.009 
   N                       26              6.23           3.081 

Aptitude by 
Treatment 
   I  U                    16              6.688          1.852 
   I  N                    13              4.462          1.613 
   II U                    12              8.000          3.384 
   II N                    13              8.000          3.364 
The graphs of the mean scores on transfer for the aptitude by 
treatment groups are found in Figures 6.3 and 6.4. 

[Figure 6.3: mean transfer score (vertical axis) plotted against 
treatment (U, N).] 

Figure 6.3. Aptitude Effect on Transfer 

[Figure 6.4: mean transfer score (vertical axis) plotted against 
aptitude level (I, II).] 

Figure 6.4. Treatment Effect on Transfer 

From Figure 6.3 it appears that Level II was superior to Level I 
for each of the treatments. Thus, there appears to be a significant 
main effect due to aptitude. Figure 6.4 shows that Treatment U was 
slightly superior to Treatment N but only for Level I students. 

These differences were statistically examined when the following 
null hypotheses were tested: 

H.2a: The difference between the mean score on transfer of 
the Treatment U Level I group and the mean score on 
transfer of the Treatment U Level II group is equal 
to the difference between the mean score on transfer 
of the Treatment N Level I group and the mean score 
on transfer of the Treatment N Level II group. 

H.2b: The mean score on transfer for Level I equals the mean 
score on transfer for Level II. 

H.2c: The mean score on transfer for Treatment U equals the 
mean score on transfer for Treatment N. 

Hypothesis H.2a corresponds to the primary research hypothesis for trans- 
fer and H.2b and H.2c correspond to the secondary ones. 

Table 6.4 shows the results of the analysis of variance used to 
test these hypotheses. 





Table 6.4 

ANALYSIS OF VARIANCE FOR O4 (TRANSFER) 

Source                    df         MS           F          p < 

Aptitude X Treatment       1        16.534       2.344      .1321 
Aptitude                   1        76.663      10.869      .0019 
Treatment                  1        19.005       2.694      .1070 
Error                     50         7.053 

From the analysis reported in Table 6.4 the following results were 
found. The F ratio used to test hypothesis H.2a was 2.344 which was 
not significant at the .05 level. Thus, hypothesis H.2a is not 
rejected. There is statistical evidence of a significant aptitude 
main effect; however, the treatment main effect hypothesis H.2c is 
not rejected. 
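
Each F ratio in Table 6.4 is the mean square for the effect divided by the error mean square. The short sketch below reproduces the three ratios and compares them with a critical value. The critical value of about 4.03 for F(1, 50) at the .05 level is taken from standard F tables and is an assumption here, since the thesis reports p values rather than a critical value.

```python
# Mean squares copied from Table 6.4 (transfer, O4).
MS_ERROR = 7.053
MS_EFFECTS = {
    "Aptitude X Treatment": 16.534,
    "Aptitude": 76.663,
    "Treatment": 19.005,
}

# Assumed critical value for F(1, 50) at alpha = .05 (standard tables).
F_CRITICAL = 4.03


def f_ratio(ms_effect, ms_error):
    """F statistic for a fixed effect tested against the error mean square."""
    return ms_effect / ms_error


F_RATIOS = {name: f_ratio(ms, MS_ERROR) for name, ms in MS_EFFECTS.items()}

if __name__ == "__main__":
    for name, f_value in F_RATIOS.items():
        verdict = "significant" if f_value > F_CRITICAL else "not significant"
        print(f"{name}: F = {f_value:.3f} ({verdict} at .05)")
```

Only the aptitude main effect exceeds the critical value, matching the conclusion drawn in the text.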

Retention Measure 

The third dependent variable was retention; this was measured by 
observation O5. The instrument used for O5 was the same as the one 
used for O3 and is described in Chapter V. The raw data for O5 are in 
Table E.5 in Appendix E. The scores ranged from 5 to 30 and the mean 
was 17.7. The descriptive statistics associated with O5 for aptitude, 
treatment and aptitude by treatment groups are presented in Table 6.5. 



Table 6.5 

DESCRIPTIVE STATISTICS FOR O5: APTITUDE, TREATMENT AND 
APTITUDE BY TREATMENT GROUPS 

Group              Number of Subjects      Mean      Standard Deviation 

Aptitude 
   I                       29             15.310          7.087 
   II                      25             20.560          6.777 

Treatment 
   U                       28             22.964          4.834 
   N                       26             12.115          5.075 

Aptitude by 
Treatment 
   I  U                    16             20.625          4.440 
   I  N                    13              8.769          2.920 
   II U                    12             26.083          3.450 
   II N                    13             15.462          4.612 



Patterns of differences similar to those for the achievement 
measure can be seen in Figures 6.5 and 6.6. It appears that there are 
strong aptitude and treatment effects, but no interaction. 

Figure 6.5. Aptitude Effect on Retention 

Figure 6.6. Treatment Effect on Retention 

These observations were examined statistically when the following 
null hypotheses were tested: 

H.3a: The difference between the mean score on retention of 
the Treatment U Level I group and the mean score on 
retention of the Treatment U Level II group is equal 
to the difference between the mean score on retention 
of the Treatment N Level I group and the mean score 
on retention of the Treatment N Level II group. 

H.3b: The mean score on retention for Level I equals the 
mean score on retention for Level II. 

H.3c: The mean score on retention for Treatment U equals 
the mean score on retention for Treatment N. 

Again, the first hypothesis is the one of primary interest since it 
corresponds to the research interaction hypothesis for retention. The 
other two correspond to the main effects research hypotheses for 
retention. 

The statistics associated with the analysis of variance used to 
test hypotheses H.3a, H.3b, and H.3c are reported in Table 6.6. 



Table 6.6 

ANALYSIS OF VARIANCE FOR O5 (RETENTION) 

Source                    df         MS           F          p < 

Aptitude X Treatment       1         5.081        .324      .5718 
Aptitude                   1       490.331      31.263      .0001 
Treatment                  1      1707.081     108.841      .0001 
Error                     50        15.684 










As Table 6.6 shows, the primary null hypothesis H.3a is not 
rejected. Again there is no evidence of any interaction. Both of the 
hypotheses H.3b and H.3c are rejected. There is statistical evidence 
of a main effect due to aptitude and a main effect due to treatment. 
Thus, the analysis of variance supports the observations made from 
Figures 6.5 and 6.6. 

Retention Ratios 

In addition to the hypotheses tested in this study, questions 
about retention were asked. In particular, the following two questions 
were posed: 

1. To what extent are the performances on achievement correlated 
with performances on retention measured four and one-half weeks later? 

2. How much retention was there? 

For the entire group of 54 subjects the correlation between 
achievement and retention was .72. Data for individuals on the achievement 
observation O3 and the retention observation O5 are reported in Table 
6.7. From the achievement measure to the retention measure 34 subjects' 
scores improved, 14 scores declined, and 6 scores remained the same. 
Of the 34 scores that improved, the range of improvement was from 1 
to 8 raw score points. Of the 14 scores which declined, the range 
was from 1 to 6 raw score points. Retention ratios (amount retained 
divided by amount achieved) varied from .60 to 2.00 with 40 subjects 
having retention ratios of 1.00 or better. 

The amount retained for the entire group is indicated by the 
estimated mean scores. The mean score on O3 was 16.5 and the mean 
score on O5 was 17.7. The retention ratio of 1.07 indicates that the 
subjects as a group did better on the retention observation than on 
the achievement measure. 
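
The retention ratio defined above (amount retained divided by amount achieved) is simple to compute. The sketch below applies it to the group means and to three individual score pairs taken from Table 6.7; the function name is a hypothetical choice for illustration.

```python
def retention_ratio(achievement, retention):
    """Retention score (O5) divided by achievement score (O3)."""
    return retention / achievement


if __name__ == "__main__":
    # Group means on O3 and O5 reported in the text.
    print("group:", round(retention_ratio(16.5, 17.7), 2))
    # Individual (O3, O5) pairs from Table 6.7.
    for o3, o5 in [(13, 20), (23, 20), (5, 10)]:
        print((o3, o5), round(retention_ratio(o3, o5), 2))
```

A ratio above 1.00 means the student scored higher four and one-half weeks later than immediately after instruction; a ratio below 1.00 means some loss.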
Table 6.7 

INDIVIDUAL RESULTS ON ACHIEVEMENT (O3), RETENTION (O5) 
AND RETENTION RATIOS 

                   Treatment U                  Treatment N 
              O3     O5    Retention       O3     O5    Retention 
                           Ratio                        Ratio 

LEVEL I       13     20      1.54           5      8      1.60 
              23     20       .87           7      9      1.29 
              16     19      1.19          11      7       .64 
               5     10      2.00          15     13       .87 
              16     --       --            9      7       .78 
              18     20      1.11          10     15      1.50 
              26     26      1.00          10      6       .60 
              15     20      1.33           9     11      1.22 
              19     27      1.42          10      7       .70 
              21     24      1.14           7      7      1.00 
              14     16      1.14           8      8      1.00 
              23     16       .70           8      5       .63 
              25     24       .96           9     11      1.22 
              19     25      1.32 
              15     17      1.13 

LEVEL II      24     18       .75          13     14      1.08 
              25     25      1.00          11     10       .91 
              22     27      1.23           9     11      1.22 
              28     30      1.07          24     21       .88 
              25     26      1.04          15     15      1.00 
              20     25      1.25          20     25      1.11 
              28     29      1.03          17     20      1.18 
              26     28      1.08          25     23       .92 
              26     29      1.11           9     12      1.33 
              24     25      1.04          11     10       .91 
              26     29      1.11          12     15      1.25 
              18     22      1.22          13     15      1.15 
                                           12     12      1.00 

Summary 

Each of the null hypotheses which corresponded to a primary 
research hypothesis concerning interaction is not rejected. That is, no 
significant interaction was found for any of the three dependent 
measures. All except one of the null hypotheses which corresponded to 
the secondary research hypotheses are rejected. That is, there were 
significant main effects, both aptitude and treatment, for each of the 
dependent measures with the exception of the treatment main effect for 
transfer. 

The examination of retention ratios revealed little difference in 
retention due to treatment or aptitude. The extremely high retention 
ratios indicated not only that most students retained what they had 
learned but also that their performance improved. 

Chapter VII 
CONCLUSION TO THE THESIS 



Introduction 

After giving a brief summary of the study and its limitations, 
this chapter discusses the conclusions, the implications for 
curriculum development and the recommendations for future research. 

Summary 

The main purpose of this study was to examine the interaction of 
two treatments on measuring area with various levels of aptitude. The 
research strategy adopted was that of an aptitude-treatment interaction 
study. Most of the past educational ATI studies fit what Salomon (1971) 
described as the preferential model. This model prescribes treatments 
differing on form or mode and capitalizes on the learners' best general 
capabilities. This study approached the ATI question in a different 
manner. General capabilities were not used as measures of aptitude and 
the treatments did not differ in form or mode. Aptitude was defined in 
terms of the individual's ability to learn specific concepts associated 
with a unit of length measurement. The treatments were designed to 
differ only in their emphasis on the unit of area measurement. More 
specifically, the question asked was: In what manner does the ability of 
children to learn concepts associated with a unit of length affect the 
extent to which they attain concepts associated with area and a unit of 
area for each of the two given treatments? 

In order to determine this ability 90 second and third graders 
were subjected to a teach-test procedure. This procedure consisted of 
a pretest, a brief instructional treatment and a posttest, all of which 
tested or taught unit of length concepts. The results of the two tests 
were used to determine the aptitude levels. Although three levels of 
aptitude were expected only two subjects met the criteria for the 
highest level. They were dropped from the remainder of the study along 
with those students who did not fit the definition of either of the 
other two levels. Twenty-seven and thirty-two students were classified 
as Level II and Level I, respectively. The students in each of these 
levels were randomly assigned to one of two treatments, Treatment U or 
Treatment N. 

Both treatments had the same behavioral objectives, the same teacher, 
the same duration (9 days) and the same mode of instruction. They 
differed in the amount of emphasis on the unit of measure for area; 
Treatment U emphasized the unit and Treatment N did not. After the 
treatments three measures, achievement, transfer and retention, were taken. 
These measures were used to test hypotheses about the interaction of 
the aptitude levels and the treatments and about the main effects of 
aptitude and of treatments. 

No significant interactions were found between the levels and 
treatments on any of the measures. There were significant main effects due 
to level of aptitude and to treatment for achievement and retention 
measures. The only significant main effect for the transfer measure 
was aptitude. 



Limitations 

While the sources of internal validity (Campbell and Stanley, 1963) 
were controlled through the design of the study, some of the sources of 
external validity which permit generalization were not amenable to 
control. The first source of external validity which was not controlled 
was the interaction of selection and treatment. The school selected 
was a rural school whose staff was most cooperative. Thus the results 
must be interpreted for this population. A second source of invalidity 
may have been what Campbell and Stanley (1963) call reactive 
arrangements, "the patent artificiality of the experimental setting" (p. 20). 
While every effort was made to have a normal setting there is no way to 
measure the effect of the experiment itself. In this case it was not 
felt that the students in the two groups reacted differently to the 
experimental setting. It was more probable that the biases of the 
investigator and of the teacher, or the unusual mathematical expertise of the 
teacher would have accounted for any reactive arrangement invalidity. 

In interpreting the results one must also consider the reliability 
of the instruments. These reliabilities were reported in Chapter V. 
The reliabilities of the achievement and retention observations were 
respectable, but the reliability of the transfer test as well as the 
level of the difficulty of its items make any transfer findings suspect. 

Finally, the statistical analysis calls for independent 
observations. The use of individuals for the unit of analysis makes any 
results questionable. However, the clear cut results of treatment and 
aptitude level effects reduce the questionability. This limitation 
was discussed more fully in the statistical analysis section of 
Chapter IV. 



Conclusions 

This section discusses the conclusions relevant to the teach-test 
procedure and then the conclusions relevant to the hypotheses of the 
study. 

The teach-test procedure produced two distinct levels of aptitude. 
The mean scores for Level I and for Level II on the pretest, O1, were 
8.594 and 9.222, respectively. On the posttest, O2, the mean score for 
Level I had not changed significantly (p < .05). However, the mean 
score for Level II was 16.556 which was a significant change (p < .05). 
Likewise, the difference between the posttest mean scores for the two 
groups was significant (p < .05). Furthermore, for the dependent 
measures of achievement and retention there was a significant aptitude 
effect. Thus, the teach-test method was successful in predicting the 
ability of students to learn concepts associated with measuring area. 

These conclusions should be interpreted in light of the reliability 
coefficients .340 and .703 for the two observations O1 and O2, 
respectively. Neither reliability coefficient was extremely high. This 
problem first was addressed in Chapter V. A closer examination of the 
reliabilities was made in the following manner. 

In both the instruments used for O1 and O2 there existed what 
appeared to be two categories of questions which could be answered 
correctly for different reasons. The first category consists of eight 
questions which would be answered correctly if the child was only 
responding to the numbers given and ignoring the unit. For example, he 
was asked to order two objects which measured 8 and 6 of unit a, 
respectively. The second category consists of twelve questions which 
cannot be answered correctly if the child was only responding to the 
number and ignoring the unit. For example, he was asked to order two 
objects which measured 10 of unit a and 10 of unit b, respectively, 
when the two units in question were not equal. 

Separate Hoyt reliability coefficients were found for each of these
categories for each observation, O1 and O2. For both subscales of O1
the reliability coefficients were .82, compared with the reliability
coefficient of .34 for the entire scale. For O2 the reliability of the
first category of items (subscale 1) was .73 and the reliability of the
second category of items (subscale 2) was .84, compared with the
reliability of .70 for the entire scale. The item analyses for these
separate subscales are found in Tables F.6 and F.7 in Appendix F.
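
As an illustration of how a Hoyt coefficient is obtained, the following sketch computes it from a students-by-items score matrix by the analysis of variance route: reliability is 1 minus the ratio of the residual mean square to the between-students mean square. The function name and the small response matrix are hypothetical and are not data from this study; the sketch does not reproduce the item analyses in Appendix F.

```python
def hoyt_reliability(scores):
    """Hoyt's analysis-of-variance reliability for a students x items matrix.

    scores: list of rows, one row per student, each row holding that
    student's item scores (for example 1 = correct, 0 = incorrect).
    """
    n = len(scores)       # number of students
    k = len(scores[0])    # number of items
    grand_mean = sum(sum(row) for row in scores) / (n * k)

    student_means = [sum(row) / k for row in scores]
    item_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into students, items and residual.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_students = k * sum((m - grand_mean) ** 2 for m in student_means)
    ss_items = n * sum((m - grand_mean) ** 2 for m in item_means)
    ss_residual = ss_total - ss_students - ss_items

    ms_students = ss_students / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return 1 - ms_residual / ms_students


# Hypothetical responses of five students to four items (illustration only).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

print(round(hoyt_reliability(responses), 3))
```

The same function could be applied separately to the columns belonging to each subscale and to the full set of items, which is the comparison described above. The sketch assumes that the students do not all have the same total score; otherwise the between-students mean square is zero and the coefficient is undefined.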

It is important that decisions based on instruments be based on
reliable ones. The subscales were more reliable than the entire scales
for both O1 and O2. It was thus decided to compare the selection of
aptitude levels based on the entire scale with the selection based on a
subscale.

Since the items in the second category were more discriminating, the
scores of students on this category for both O1 and O2 were considered.
After subtracting the 8 possible points which may have been attained on
the first category items, new criteria for levels and for elimination
were established; they are shown in Figure 7.1.

Using these new criteria, only two students in Level II would not
have been reclassified as Level II. Both of these would have been
classified in Level III under this system. On the other hand, 9 of the 32
Level I students would have been reclassified; only 3 of these would
have been placed at a higher level, and the other 6 would have been
eliminated. Only five of the students who were originally eliminated
would have been retained, two in Level II and three in Level I.

      O1                 O2              Change     Classification

  S(O1)* > 7         S(O2) > 7                        Level III
  S(O1) > 7          S(O2) < 7                        (a)
  S(O1) < 7          S(O2) > 7           c > 3        Level II
  S(O1) < 7          S(O2) > 7           c < 3        (b)
  3 < S(O1) < 7      3 < S(O2) < 7                    (c)
  3 < S(O1) < 7      S(O2) < 3                        (d)
  S(O1) < 3          3 < S(O2) < 7                    (e)
  S(O1) ≤ 3          S(O2) < 3           |c| < 3      Level I
  S(O1) < 3          S(O2) < 3           |c| > 3      (f)

  * S(Oi) indicates the score on subscale 2 of Oi, and c indicates the
    amount of change between O1 and O2 on subscale 2.

  Figure 7.1. Classification for Aptitude Levels Based
              on Subscale 2 of O1 and O2

Figure 7.2 shows the mean scores on achievement for those students
who were in Treatments U and N for both the new classification and
the classification used in the study.

               Classification Using       Reclassification Using
                   Entire Scale                 Subscale 2
   Level          U          N                U          N

     I          18.1        9.1             18.2        9.0
     II         24.3       14.7             24.6       14.5

  Figure 7.2. Mean Scores on Achievement According to
              Two Classifications



Since such small differences existed, no further analysis was made on the
achievement data. Because the retention data were so similar to the
achievement data and the transfer measure itself was not very reliable,
these were not examined in light of the reclassification.

There is no doubt that this subscale was more reliable than the
entire scale. However, the conclusions reached using the entire scale
for classification appear to be no different from those reached using
the subscale. Furthermore, the subscale did not test all the
behaviors desired. It was desirable, as was the case in the entire scale,
that some items could be answered correctly by comparing only the
numbers involved. Otherwise there was no way to identify the students
who focused on the length of the unit alone and ignored the measure. The
three students in Level I who would have been classified in higher levels
on the subscale classification were apparently looking only at the length
of the unit and not at the number of units, an error as crucial to
detect as the one of centering on measure alone. Thus, while it is
desirable to use reliable instruments, the investigator feels confident
that the lack of high reliability in this case did not affect the final
conclusions.

The analysis of the data related to the three primary hypotheses
was reported in Chapter VI. All three analysis of variance tests
revealed that there was no significant interaction between treatment and
aptitude. Thus, the study failed to produce the desired interaction.
However, this conclusion must be considered in regard to the fact that
only two levels of aptitude were established by the teach-test procedure.
The third level, the highest level of aptitude, was not present in this
population. It was expected that students at this level would do equally
well under either treatment, which would have helped to produce the desired
interaction.

The analysis associated with either the dependent measure,
achievement, or the dependent measure, retention, substantiates the same
conclusions. In examining the main effects hypotheses, it was found that
Treatment U was significantly better than Treatment N for both levels
of aptitude. In regard to achievement, Treatment U was so strong that
there was little chance of interaction. The mean score for Level I
students in Treatment U was higher than the mean score for Level II
students in Treatment N. Thus, for either level, as far as achievement
of the specified objectives is concerned, Treatment U is preferable to
Treatment N. Likewise, there was a significant aptitude main effect; those
in Level II did better than those in Level I. Retention data and the
associated analysis support these conclusions.

The achievement and retention data were examined more closely in
relation to these three questions:

1. Is the amount of retention a function of the aptitude level?

2. Is the amount of retention a function of the particular
treatment?

3. Is the amount of retention a function of the interaction
of aptitude and treatment?

Table 7.1 includes the groups' mean scores on achievement and
retention and the associated retention ratios.



                          Table 7.1

          GROUP MEANS ON ACHIEVEMENT AND RETENTION
                    AND RETENTION RATIOS

  Group              Achievement    Retention    Retention Ratio

  Aptitude
    I                   14.1          15.3            1.09
    II                  19.3          20.6            1.06

  Treatment
    U                   20.8          23.0            1.11
    N                   11.9          12.1            1.02

  Aptitude by
  Treatment
    I  U                18.1          20.6            1.13
    I  N                 9.1           8.8             .97
    II U                24.3          26.1            1.07
    II N                14.7          15.5            1.05



As one can see from Table 7.1, for Level I there was an increase
of .8 from achievement to retention and for Level II an increase of 1.3.
The retention ratio for Level I was 1.09 and for Level II was 1.06,
indicating that both groups' percentage of increase was approximately
the same. Looking at the individual data in Table 6.7 for these two
groups, one notes that nine students in Level I decreased in performance
from achievement to retention and five in Level II decreased. Thus, it
appears that the aptitude level had little effect on the amount of retention.
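
The retention ratios in Table 7.1 appear to be the retention mean divided by the achievement mean. That reading is an assumption inferred from the table, not a definition stated here. The sketch below recomputes the ratios from the rounded group means transcribed from Table 7.1; the variable and function names are illustrative only.

```python
# Group means transcribed from Table 7.1:
# group -> (achievement mean, retention mean, printed retention ratio)
table_7_1 = {
    "Level I":     (14.1, 15.3, 1.09),
    "Level II":    (19.3, 20.6, 1.06),
    "Treatment U": (20.8, 23.0, 1.11),
    "Treatment N": (11.9, 12.1, 1.02),
    "I U":         (18.1, 20.6, 1.13),
    "I N":         (9.1, 8.8, 0.97),
    "II U":        (24.3, 26.1, 1.07),
    "II N":        (14.7, 15.5, 1.05),
}


def retention_ratio(achievement_mean, retention_mean):
    """Assumed definition: retention mean divided by achievement mean."""
    return retention_mean / achievement_mean


for group, (achievement, retention, printed) in table_7_1.items():
    ratio = retention_ratio(achievement, retention)
    print(f"{group:12s} computed {ratio:.3f}   printed {printed:.2f}")
```

Because the means in the table are rounded to one decimal place, ratios recomputed this way can differ from the printed ones by about .01. A ratio above 1 indicates a higher mean on retention than on achievement.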

The amount of retention for each treatment group is reported in
Table 7.1. The mean score of the Treatment U group increased by 2.2 from
achievement to retention, while the Treatment N group's mean score
increased by only .2. The retention ratio for group U was 1.11 and for group N
was 1.02. The individual data reported in Table 6.7 indicate a similar
pattern. Only 4 subjects' scores decreased in Treatment U, while 10
scores were lower on retention than on achievement in Treatment N. These
retention ratios, the raw mean score gains and the individual data seem to
indicate that Treatment U was slightly more effective than Treatment N.
This must be interpreted in light of the findings reported in the next
paragraph.

If the amount of retention for each aptitude by treatment group is
examined (see Table 7.1), one finds that the only group which decreased
from achievement to retention was Treatment N, Level I. On the other
hand, the Treatment U, Level I group showed the greatest increase. These
groups appeared to contribute substantially to the conclusions reached in
the previous paragraph about the effect of the treatment. While no
further analysis was carried out, it appears that the amount of
retention was affected to some extent by the interaction of aptitude and
treatment.

The retention was extremely high for all groups. As far as the
investigator could ascertain, there was no teaching of the concepts
between the achievement observation and the retention observation. Much
of the time between the two observations was a vacation period. There
are several possible explanations for the high retention. It is possible
that the achievement scores were lower than they would have been if the
observation had not occurred just before a major vacation. It is possible
that the achievement test was a learning experience and this interacted
with the retention observation. Or it is possible that subjects in
Treatment U found the learning meaningful and subjects in Treatment N
overlearned the objectives they learned; both of these are often used to
explain high retention. The design of this study did not permit
investigation of any of these possibilities.

The conclusions related to the transfer measure are not as
clear-cut. Although there was a significant aptitude main effect, there was
not a significant treatment main effect. The low reliability of the
instrument and the high level of difficulty made interpretation of the
results unsubstantiable.

Examining the instrument used for the transfer observation more closely
reveals that many of the items differed from the objectives in two
dimensions: attribute and type of comparison question. The comparison
question which asks about the inverse relation between an object and its
measure or between an object and the unit of measure had not been asked
previously. The instrument should be reconstructed to include more
items that differ in only one dimension.

There were four items (4-7) on area which are very similar to the
ones on the achievement test; the only difference is that units not
called for in the treatments are used. The percent of correct responses
for the different groups of this study is shown in Figure 7.3.

                     Treatment
   Level           U          N

     I            .59        .21
     II           .64        .39

  Figure 7.3. Percent of Correct Responses on
              Items 4-7 of the Transfer Observation

No statistical tests were run on these data; however, one notices that
the ability to work with unfamiliar units favored Treatment U. It
appears that the results of the transfer measure might have been
different if more items had been of a near transfer type rather than of a
far transfer type. That is, more items are needed which differ from
the objectives on only one dimension rather than on two dimensions.

Implications for Curriculum Development

One of the purposes of this study was of a formative nature:
to generate knowledge about mathematics instruction which could be
used in a mathematics program. This is the purpose addressed here as
information gained through subjective observations and objective
testing is reported. All recommendations made here are in reference to the
sample in this study. The background of the population involved should
be carefully considered before adopting any of these recommendations.

First, the feasibility of teaching area concepts to second and
third graders was examined. From observations during the treatments
it is clear that many of the behaviors prerequisite to terminal
behaviors were easily obtained by the majority of the students. From
the results of the achievement test it is evident that many second and
third graders were successful in exhibiting the terminal behaviors.
However, the significant main effect for aptitude shows that the
ability of children at this age to attain these behaviors differs
greatly. Although no statistical analysis was done on the difference in
achievement due to grade, the data show some interesting trends. The
mean score on achievement for the second graders was 13.2 and for third
graders was 18.7. This is misleading unless one recalls that most of
the second graders were in Level I. Looking more carefully at the
grades within levels, one notices in Level I that the second graders'
mean is only one point lower than the third graders'. Furthermore, the
second graders in Treatment N did slightly better than the third
graders. There were only three second graders in Level II; their scores
were in the range of the third graders' scores. Second graders in
Treatment U did slightly better than the third graders in Treatment N.
Thus, it appears that the proper placement of these area concepts depends
more on the child's ability as defined by the teach-test procedure of
this study than on grade level and on the type of treatment. In
designing a curriculum it is recommended that Treatment U be begun in the
second grade and extended through the third grade.

When the achievement data were examined in terms of sex differences,
it was found that boys did better than girls in Treatment U, but the
opposite was true for Treatment N. The direction of the difference in
Treatment U could be expected since there was a much greater percentage
of boys in Level II than in Level I and a greater percentage of the girls
were in Level I than in Level II. Approximately the same division of boys
and girls occurred in Treatment N, so the fact that girls did better
than the boys is surprising. Again, no further statistical analysis
was done; this information is reported only to support the recommendation
that no difference in treatments seems to be necessary based only on
sex.

Carpenter (1971), in discussing the implications of his study for
instruction, concludes:



"These results do not imply that experiences with
different units of measure should not be included
in measurement topics. They do imply, however, that
many young children will not master all the
implications of different units by concentrating on
measurement processes. If one is really concerned
with mastery of measurement concepts with different
units of measure, it would seem necessary to
provide a wide range of experiences that help a child
to focus on more than one immediate dimension."



This study verifies this observation made by Carpenter. The 
children in Treatment U who were given experiences with different units 
of measure behaved differently from those in Treatment N. The scores 
on the achievement and retention observations were significantly dif- 
ferent for the two treatments. 

In looking more carefully at these observations, several other
patterns of answers became obvious. One of the most striking was the
recording of the unit when assigning measurements to areas. In both
treatments, whenever the teacher wrote the area she recorded both the
number and the unit. One treatment did not overtly stress the writing
of the unit more than the other. However, the children in Treatment U
saw the necessity for recording the unit. Almost without exception
every child in Treatment U recorded the unit on every problem on both
the achievement and the retention test. No students in Treatment N
recorded the unit. Thus, if curriculum developers want the child to
respond with the unit, it seems to be necessary to present him with
problems in which the unit makes a difference. When only problems of
comparing two areas which are covered with the same unit are presented, the
child has no reason to focus on the unit and therefore does not record
it.

There were eighteen problems on the achievement measure which asked 
the child to compare two regions which had been covered. These problems 
differed on two dimensions: (1) whether or not the regions were covered 
exactly and (2) whether or not congruent units were used to cover both 
regions. Figure 7.4 shows the number of problems in each category. 
Figure 7.5 shows the average level of difficulty (the average percentage 
of correct responses for each treatment and each level). 



                      Exact      Non-exact

  Congruent             2            4
  Non-congruent         8            4

  Figure 7.4. Number of each type of comparison problem
              which involved two covered regions on the
              achievement measure

                                 Exact              Non-exact
                               Treatment            Treatment
  Units            Level       U       N            U       N

  Congruent          I        88      91           56      29
                     II      100     100           71      31

  Non-congruent      I        70      21           36      10
                     II       91      39           70      27

  Figure 7.5. Percentage of correct responses to comparison
              problems which involved two covered regions
              on the achievement measure



One notices in Figure 7.5 that problems in which the coverings were
exact and congruent units were used were the easiest for all groups.
This is not surprising since only the numbers needed to be compared
without regard to the unit. Likewise, the problems in which the coverings
were non-exact and non-congruent units were used proved to be the most
difficult for all groups. However, for Level II, Treatment U there was
little difference in the difficulty due to the type of unit. A similar
analysis of these scores on the retention measure indicated no
difference between those covered with congruent and non-congruent units
for this group. One may also observe that for Level I, Treatment N
the second least difficult type of problem was the non-exact, congruent.
For every other group the exact, non-congruent was the second least
difficult. This is consistent with the children's opportunity to learn
and the results of the teach-test procedure. The subjects in Treatment
N had not been exposed to problems involving non-congruent units of
area but had been given non-exact coverings. Thus, one would expect
those in Level I, Treatment N to do better on the non-exact, congruent
than on the exact, non-congruent problems. On the other hand, the students
in Level II, Treatment N were the ones who could handle non-congruent
units of length in the teach-test procedure, so it is not surprising
that they could handle these problems with area.

Looking at the difference between the two treatment groups U and
N and their difficulty levels on exact and non-exact problems, one finds
that 81% of the exact and 56% of the non-exact problem responses were
correct for Treatment U students and 43% of the exact and 24% of the
non-exact problem responses were correct for Treatment N. Thus, for
both groups the non-exact problems were more difficult, and the difference
between the difficulty levels was approximately the same.

However, if one makes a similar analysis between the congruent and
non-congruent problems, one finds that the level of difficulty of
these two types was about the same for Treatment U. Seventy-three per
cent of their responses were correct for the congruent type and 70 per
cent for the non-congruent. But for Treatment N the non-congruent type
was twice as difficult. Only 26 per cent of their responses were
correct for the non-congruent type, and 52 per cent were correct for the
congruent.
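
The way percentages for a broader category can be pooled from Figures 7.4 and 7.5 may be sketched as follows. Two assumptions are made and should be checked against the original figures: the count that is illegible in Figure 7.4 is read as 8 (so that the four counts total the eighteen problems mentioned in the text), and each percentage in Figure 7.5 is treated as an average over the problems of its type, so that problem types are weighted by their counts. The names below are illustrative. Pooling across the two aptitude levels, as in the treatment percentages quoted above, would also require the group sizes, which are not given in this section.

```python
# Number of comparison problems of each type, keyed by (covering, unit).
# Assumed reading of Figure 7.4; the illegible count is taken to be 8.
problem_counts = {
    ("exact", "congruent"): 2,
    ("non-exact", "congruent"): 4,
    ("exact", "non-congruent"): 8,
    ("non-exact", "non-congruent"): 4,
}

# Percentage of correct responses from Figure 7.5, keyed by (level, treatment).
percent_correct = {
    ("I", "U"): {("exact", "congruent"): 88, ("non-exact", "congruent"): 56,
                 ("exact", "non-congruent"): 70, ("non-exact", "non-congruent"): 36},
    ("I", "N"): {("exact", "congruent"): 91, ("non-exact", "congruent"): 29,
                 ("exact", "non-congruent"): 21, ("non-exact", "non-congruent"): 10},
    ("II", "U"): {("exact", "congruent"): 100, ("non-exact", "congruent"): 71,
                  ("exact", "non-congruent"): 91, ("non-exact", "non-congruent"): 70},
    ("II", "N"): {("exact", "congruent"): 100, ("non-exact", "congruent"): 31,
                  ("exact", "non-congruent"): 39, ("non-exact", "non-congruent"): 27},
}


def pooled_percent(group, covering=None, unit=None):
    """Average percent correct for one group over the selected problem types,
    weighting each problem type by its number of problems."""
    cells = [(c, u) for (c, u) in problem_counts
             if covering in (None, c) and unit in (None, u)]
    total_problems = sum(problem_counts[cell] for cell in cells)
    weighted_sum = sum(problem_counts[cell] * percent_correct[group][cell]
                       for cell in cells)
    return weighted_sum / total_problems


for group in percent_correct:
    congruent = pooled_percent(group, unit="congruent")
    non_congruent = pooled_percent(group, unit="non-congruent")
    print(group, round(congruent, 1), round(non_congruent, 1))
```

Under these assumptions the pooled values for each treatment by level group can be compared directly, for example the congruent and non-congruent percentages for Level II, Treatment U.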

In developing curriculum materials one should keep these results
in mind. It appears that second and third graders are capable of
handling problems involving non-congruent units, but they must be
presented the opportunity. Also, special care must be given to problems
involving non-exact coverings since they appear to be more difficult
than the exact ones.

Several subjective observations were made during the treatments
concerning the unit. The children in Treatment U were more challenged
by the problems they were presented. The better students in Treatment
N had few experiences which stretched their ability. They probably could
have learned everything presented in a much shorter time.

Further evidence of the ability of a child of this age to work
with different units of measure was given by the teach-test procedure.
As was shown by the results of O1 and O2, some of the children were and
some were not capable, after two days of instruction, of handling the
relationships between different units of measure. It is interesting to
note that many of those students in Level I who could not work
successfully with different units of length after the two day instruction
period could handle such relationships with units of area after nine days of
instruction and experience.

The results of the retention observation further support the
recommendation to design a curriculum which approaches area concepts as
Treatment U did rather than as Treatment N. Furthermore, if one looks
at the retention data in relation to the achievement data (see Table
E.5 in Appendix E and Table 6.5 in Chapter 6), one finds more evidence
to support this recommendation. The mean of the achievement observation
was 2.2 lower than the mean of the retention observation for Treatment
U, but only .2 lower for Treatment N. In Treatment U 22 of the 28
students showed an increase from the immediate achievement to the
retention, while 12 of the 26 students in Treatment N increased. Nine
Treatment U subjects increased more than three points (from four to
eight points), but only one Treatment N subject increased more than
three points. One of the aims of curriculum development is to
construct programs which increase retention, and it appears that
Treatment U is slightly superior to Treatment N in this regard.

Because of the difficulty level and the low reliability of the
transfer test, any recommendations with regard to transfer would be
questionable. Since the third level of aptitude was not found in the
teach-test procedure, there was no way to measure the transfer from
length to area. It had been hypothesized that students in Level III
would do equally well under either treatment because of their ability
to transfer their knowledge about the unit of length to the unit of
area. Thus, no specific recommendations regarding transfer can be made
from objective observations.

Finally, some subjective observations about the type of instruction
are relevant to curriculum development. The activities developed for
this study were appropriate for this age group and were manageable.
Many of the activities required extensive preparation, but much of the
preparation could be done by students or the activities modified to
simplify the preparation. The activities which held the children's
interest longest were those which involved the children with materials
and problems. The comparison problems were more motivational than ones
which required merely assigning measurements. The children enjoyed
the contests and games, which added needed variety. The characters, the
strange houses and the short stories told by the teacher to introduce
them served two purposes. First, they gave a way to introduce many
ideas and to tie the activities together. Second, they held the
children's interest and sparked their imaginations. When asked what
they liked best, most responded that they liked the characters. Others
liked the stations, the puzzles and the "snake" test (the teach-test
procedure). Thus recommendations for curriculum development include
providing variety through puzzles, games, contests, stories and story
characters, providing activities which involve the children in measuring
objects in the room, providing problems to be solved and providing time
for the children to discuss what they have observed.

Recommendations for Future Research 

1) The final teach-test procedure did not produce the third level
of aptitude. It might be worthwhile to replicate this procedure with
another sample. There are three recommendations for changing the sample:

a) Select from third and fourth graders instead of 
from second and third graders. 

b) Select from a DMP population; that is, select from 
a population which is familiar with a measurement 
approach and the type of activities presented in 
the treatments. 

c) Select a larger sample of second and third graders
from a different environment.

2) The retention data showed an increase from the achievement test.
A future study should look at a retention test given at a later date.

3) The Level I students in Treatment U were highly successful in
handling the relationships among units of area upon the completion of
the treatment. An interesting transfer measure would be the repetition
of the test involving units of length used in the teach-test procedure.

4) If the treatments or instruments are to be used in further
studies, changes should be made according to the remarks found in the
conclusion section of this chapter or in the journal in Appendix C. In
particular, Treatment N should be strengthened by placing more emphasis
on comparing inexactly covered regions.

5) The child's ability to learn about other units of measure needs
further investigation. Likewise, the ability to transfer from one
attribute to another needs more careful examination. By varying the
attributes and obtaining the third level of aptitude, the design of this
study could be used for such investigations.

6) The teach-test procedure was successful in producing two
distinct groups. With the collection of additional data (IQ, teachers'
ratings, etc.) the following interpretations of Heimer and Lottes'
hypotheses could be tested:

a) This determination of the aptitude levels is a better
predictor of success for each of the criterion variables
than conventional procedures (IQ, teachers' ratings).

b) This determination of the aptitude levels measured
factors not taken into account by conventional procedures
for predicting success for each of the criterion variables.

c) This determination of the aptitude levels was a better
predictor of success for each of the criterion
variables for Treatment U than for Treatment N.

d) This determination of the aptitude levels was a
better predictor of success for each of the criterion
variables for Level II than for Level I.

7) Defining aptitude as the ability to learn is not a common
approach. Although no significant interactions between aptitude and
treatment were found, the investigator feels that this approach is
worth pursuing. The treatments and dependent observations need to be
refined and the third level of aptitude needs to be obtained before
drawing any conclusions about using this method of determining aptitudes
in an ATI study.

8) Performance on Piagetian type tasks related to area and length 
might profitably be investigated before and after this experiment. 
Although instruction has not often proved to change performance on 
Piagetian tasks, the subjects in Treatment U appeared to be coordinating 
the unit with the number and not centering on only one dimension. This 
ability may make a difference in the performance on typical area and 
length Piagetian tasks. 

Concluding Remarks 

This study was made in response to questions raised by a curriculum
development project. Although the hypothesized interaction between
aptitudes and treatments was not found, many results were relevant to
the development of curriculum materials. Not all questions associated
with curriculum development may be answered by research, but many more
questions than are now being answered through research can be and
need to be. Thus, there is a need for further research of this type
to be made in conjunction with developing curriculum.